
Commit 5dabb01

Merge pull request #1005 from nf-core/dev
Release 5.4.2
2 parents 0c370ba + 69e200c commit 5dabb01

43 files changed

Lines changed: 732 additions & 508 deletions

.nf-core.yml

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -16,4 +16,4 @@ template:
1616
name: mag
1717
org: nf-core
1818
outdir: .
19-
version: 5.4.1
19+
version: 5.4.2

CHANGELOG.md

Lines changed: 24 additions & 1 deletion
@@ -3,6 +3,29 @@
33
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
44
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
55

6+
## 5.4.2 Yellow Frog patch [2026-03-31]
7+
8+
### `Added`
9+
10+
- [#1006](https://github.com/nf-core/mag/pull/1006) - Add citations and references texts in the `utils_nfcore_mag_pipeline` subworkflow (by @dialvarezs)
11+
12+
### `Changed`
13+
14+
- [#1000](https://github.com/nf-core/mag/pull/1000) - Update GUNC modules and enable the tool in tests (by @dialvarezs)
15+
- [#1004](https://github.com/nf-core/mag/pull/1004) - Replace `collectFile` with `qsv/cat` on GUNC summary merging (by @dialvarezs)
16+
17+
### `Fixed`
18+
19+
- [#1001](https://github.com/nf-core/mag/pull/1001) - Include all binners in parameter validation for running DASTool (by @dialvarezs)
20+
- [#1002](https://github.com/nf-core/mag/pull/1002) - Fix BUSCO publish dir to prevent filename collision (by @dialvarezs)
21+
- [#1002](https://github.com/nf-core/mag/pull/1002) - Avoid whole-batch BUSCO failure when classification fails for a single bin. (by @dialvarezs)
22+
23+
### `Dependencies`
24+
25+
| Tool | Previous version | New version |
26+
| ---- | ---------------- | ----------- |
27+
| GUNC | 1.0.6 | 1.1.0 |
28+
629
## 5.4.1 - Yellow Frog patch [2026-03-13]
730

831
### `Changed`
@@ -12,7 +35,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
1235

1336
### `Fixed`
1437

15-
- [#974](https://github.com/nf-core/mag/pull/974) - Re-add accidently removed functionality for using metaSPAdes contigs downstream rather than scaffolds (reported by @Pranjal-Bioinfo, fix by @jfy133)
38+
- [#979](https://github.com/nf-core/mag/pull/979) - Re-add accidently removed functionality for using metaSPAdes contigs downstream rather than scaffolds (reported by @Pranjal-Bioinfo, fix by @jfy133)
1639
- [#984](https://github.com/nf-core/mag/pull/984) - Fix docs regarding usage of Bin QC tool scores when filtering bins for post-processing (by @dialvarezs, @claude)
1740
- [#987](https://github.com/nf-core/mag/pull/987) - Fix several documentation issues (by @dialvarezs)
1841
- [#988](https://github.com/nf-core/mag/pull/988) - Fix regarding validation of column `long_reads_platform` in the input samplesheet ([#985](https://github.com/nf-core/mag/issues/985) by @vinisalazar)

CITATIONS.md

Lines changed: 13 additions & 9 deletions
@@ -10,7 +10,7 @@
1010
1111
## Pipeline tools
1212

13-
- [AdapterRemoval2](https://doi.org/10.1186/)
13+
- [AdapterRemoval2](https://doi.org/10.1186/s13104-016-1900-2)
1414

1515
> Schubert, M., Lindgreen, S., and Orlando, L. 2016. "AdapterRemoval v2: Rapid Adapter Trimming, Identification, and Read Merging." BMC Research Notes 9 (February): 88. doi: 10.1186/s13104-016-1900-2
1616
@@ -24,7 +24,7 @@
2424

2525
> Danecek, Petr, et al. "Twelve years of SAMtools and BCFtools." Gigascience 10.2 (2021): giab008. doi: 10.1093/gigascience/giab008
2626
27-
- [Bowtie2](https:/dx.doi.org/10.1038/nmeth.1923)
27+
- [Bowtie2](https://doi.org/10.1038/nmeth.1923)
2828

2929
> Langmead, B. and Salzberg, S. L. 2012 Fast gapped-read alignment with Bowtie 2. Nature methods, 9(4), p. 357–359. doi: 10.1038/nmeth.1923.
3030
@@ -42,7 +42,7 @@
4242
4343
- [CheckM2](https://doi.org/10.1038/s41592-023-01940-w)
4444

45-
> Chklovski, A., Parks, D. H., Woodcroft, B. J., & Tyson, G. W. (2023). CheckM2: a rapid, scalable and accurate tool for assessing microbial genome quality using machine learning. Nature Methods, 20(8), 1203-1212. doi: https://doi.org/10.1038/s41592-023-01940-w
45+
> Chklovski, A., Parks, D. H., Woodcroft, B. J., & Tyson, G. W. (2023). CheckM2: a rapid, scalable and accurate tool for assessing microbial genome quality using machine learning. Nature Methods, 20(8), 1203-1212. doi: 10.1038/s41592-023-01940-w
4646
4747
- [Chopper](https://doi.org/10.1093/bioinformatics/bty149)
4848

@@ -86,9 +86,9 @@
8686

8787
> Garrison E, Marth G. Haplotype-based variant detection from short-read sequencing. arXiv preprint arXiv:1207.3907 [q-bio.GN] 2012
8888
89-
- [geNomad](https://doi.org/10.1101/2023.03.05.531206)
89+
- [geNomad](https://doi.org/10.1038/s41587-023-01953-y)
9090

91-
> Camargo, A. P., et al. (2023). You can move, but you can’t hide: identification of mobile genetic elements with geNomad. bioRxiv preprint. doi: 10.1101/2023.03.05.531206
91+
> Camargo, A. P., et al. (2023). Identification of mobile genetic elements with geNomad. Nature Biotechnology 42, 1303–1312. doi: 10.1038/s41587-023-01953-y
9292
9393
- [GTDB-Tk](https://doi.org/10.1093/bioinformatics/btz848)
9494

@@ -100,7 +100,7 @@
100100
101101
- [BIgMAG](https://doi.org/10.12688/f1000research.152290.2)
102102

103-
> Yepes-García, J., Falquet, L. (2024). Metagenome quality metrics and taxonomical annotation visualization through the integration of MAGFlow and BIgMAG. F1000Research 13:640. doi.org/10.12688/f1000research.152290.2
103+
> Yepes-García, J., Falquet, L. (2024). Metagenome quality metrics and taxonomical annotation visualization through the integration of MAGFlow and BIgMAG. F1000Research 13:640. doi: 10.12688/f1000research.152290.2
104104
105105
- [MaxBin2](https://doi.org/10.1093/bioinformatics/btv638)
106106

@@ -116,19 +116,19 @@
116116
117117
- [MetaEuk](https://doi.org/10.1186/s40168-020-00808-x)
118118

119-
> Levy Karin, E., Mirdita, M. & Söding, J. MetaEuk—sensitive, high-throughput gene discovery, and annotation for large-scale eukaryotic metagenomics. Microbiome 8, 48 (2020). 10.1186/s40168-020-00808-x
119+
> Levy Karin, E., Mirdita, M. & Söding, J. MetaEuk—sensitive, high-throughput gene discovery, and annotation for large-scale eukaryotic metagenomics. Microbiome 8, 48 (2020). doi: 10.1186/s40168-020-00808-x
120120
121121
- [metaMDBG](https://doi.org/10.1038/s41587-023-01983-6)
122122

123-
> Benoit, G., Raguideau, S., James, R. et al. High-quality metagenome assembly from long accurate reads with metaMDBG. Nat Biotechnol 42, 1378–1383 (2024). doi:10.1038/s41587-023-01983-6
123+
> Benoit, G., Raguideau, S., James, R. et al. High-quality metagenome assembly from long accurate reads with metaMDBG. Nat Biotechnol 42, 1378–1383 (2024). doi: 10.1038/s41587-023-01983-6
124124
125125
- [minimap2](https://doi.org/10.1093/bioinformatics/bty191)
126126

127127
> Li, H. (2018). Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics , 34(18), 3094–3100. doi: 10.1093/bioinformatics/bty191
128128
129129
- [MMseqs2](https://www.nature.com/articles/nbt.3988)
130130

131-
> Steinegger, M., Söding, J. MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nat Biotechnol 35, 1026–1028 (2017).10.1038/nbt.3988
131+
> Steinegger, M., Söding, J. MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nat Biotechnol 35, 1026–1028 (2017). doi: 10.1038/nbt.3988
132132
133133
- [MultiQC](https://pubmed.ncbi.nlm.nih.gov/27312411/)
134134

@@ -178,6 +178,10 @@
178178

179179
> Karlicki, M., Antonowicz, S., Karnkowska, A., 2022. Tiara: deep learning-based classification system for eukaryotic sequences. Bioinformatics 38, 344–350. doi: 10.1093/bioinformatics/btab672
180180
181+
- [Trimmomatic](https://doi.org/10.1093/bioinformatics/btu170)
182+
183+
> Bolger, A. M., Lohse, M., & Usadel, B. (2014). Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics, 30(15), 2114-2120. doi: 10.1093/bioinformatics/btu170
184+
181185
## Data
182186

183187
- [Full-size test data](https://doi.org/10.1038/s41587-019-0191-2)
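
Several of the CITATIONS.md changes above bring DOI mentions to a single `doi: 10.xxxx/...` form. The following is a minimal Python sketch of that normalisation, not code from this commit: the function name `normalise_doi` and the regular expression are assumptions made for illustration, and the pattern only covers DOI shapes like those in this diff.

```python
import re

# Assumed pattern: "10.", a 4-9 digit registrant code, "/", then a suffix
# that runs until the next whitespace character.
_DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")


def normalise_doi(raw: str) -> str:
    """Return the DOI found in `raw`, rewritten as 'doi: <DOI>'.

    Handles the variants seen in this diff: a full resolver URL, a bare
    'doi.org/...' prefix, 'doi:' without a space, and a DOI glued to the
    preceding text.
    """
    match = _DOI_PATTERN.search(raw)
    if match is None:
        raise ValueError(f"no DOI found in {raw!r}")
    # Drop a trailing full stop that belongs to the sentence, not the DOI.
    return "doi: " + match.group(0).rstrip(".")


print(normalise_doi("doi: https://doi.org/10.1038/s41592-023-01940-w"))
print(normalise_doi("(2017).10.1038/nbt.3988"))
```

Searching for the DOI itself, rather than stripping a list of known prefixes, keeps the sketch short and tolerant of new prefix variants.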

assets/methods_description_template.yml

Lines changed: 0 additions & 2 deletions
@@ -3,8 +3,6 @@ description: "Suggested text and references to use when describing pipeline usag
33
section_name: "nf-core/mag Methods Description"
44
section_href: "https://github.com/nf-core/mag"
55
plot_type: "html"
6-
## TODO nf-core: Update the HTML below to your preferred methods description, e.g. add publication citation for this pipeline
7-
## You inject any metadata in the Nextflow '${workflow}' object
86
data: |
97
<h4>Methods</h4>
108
<p>Data was processed using nf-core/mag v${workflow.manifest.version} (${doi_text}; <a href="https://doi.org/10.1093/nargab/lqac007">Krakau <em>et al.</em>, 2022</a>) of the nf-core collection of workflows (<a href="https://doi.org/10.1038/s41587-020-0439-x">Ewels <em>et al.</em>, 2020</a>), utilising reproducible software environments from the Bioconda (<a href="https://doi.org/10.1038/s41592-018-0046-7">Grüning <em>et al.</em>, 2018</a>) and Biocontainers (<a href="https://doi.org/10.1093/bioinformatics/btx192">da Veiga Leprevost <em>et al.</em>, 2017</a>) projects.</p>

assets/multiqc_config.yml

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
11
report_comment: >
2-
This report has been generated by the <a href="https://github.com/nf-core/mag/releases/tag/5.4.1" target="_blank">nf-core/mag</a> analysis pipeline. For information about how to interpret these results, please see the <a href="https://nf-co.re/mag/5.4.1/docs/output" target="_blank">documentation</a>.
2+
This report has been generated by the <a href="https://github.com/nf-core/mag/releases/tag/5.4.2" target="_blank">nf-core/mag</a> analysis pipeline. For information about how to interpret these results, please see the <a href="https://nf-co.re/mag/5.4.2/docs/output" target="_blank">documentation</a>.
33
report_section_order:
44
"nf-core-mag-methods-description":
55
order: -1000

conf/base.config

Lines changed: 4 additions & 0 deletions
@@ -183,4 +183,8 @@ process {
183183
withName: CHECKM2_PREDICT {
184184
errorStrategy = { task.exitStatus in ((130..145) + 104 + 175) ? 'retry' : task.exitStatus == 1 ? 'ignore' : 'finish' }
185185
}
186+
withName: 'CONCAT_BUSCO_TSV|CONCAT_CHECKM_TSV|CONCAT_CHECKM2_TSV|CONCAT_GUNC_TSV|CONCAT_GUNC_CHECKM_TSV' {
187+
cpus = 1
188+
memory = { 1.GB * task.attempt }
189+
}
186190
}
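
For readers less familiar with Groovy closures in Nextflow config, the two dynamic directives in this hunk can be paraphrased in Python. This is an illustrative sketch only: the function names are hypothetical, and the logic is read directly from the `errorStrategy` and `memory` expressions shown above rather than from Nextflow internals.

```python
# Exit statuses that trigger a retry: the inclusive range 130..145 plus 104 and 175,
# mirroring ((130..145) + 104 + 175) in the CHECKM2_PREDICT block.
RETRY_EXIT_STATUSES = set(range(130, 146)) | {104, 175}


def checkm2_error_strategy(exit_status: int) -> str:
    """Paraphrase of the CHECKM2_PREDICT errorStrategy closure."""
    if exit_status in RETRY_EXIT_STATUSES:
        return "retry"
    if exit_status == 1:
        return "ignore"
    return "finish"


def concat_memory_gb(attempt: int) -> int:
    """Paraphrase of `memory = { 1.GB * task.attempt }` for the CONCAT_* processes."""
    return 1 * attempt


for status in (137, 1, 2):
    print(status, checkm2_error_strategy(status))
print("memory on attempt 3:", concat_memory_gb(3), "GB")
```

The memory request grows linearly with the attempt number, so a resubmitted concatenation task asks for 2 GB on its second attempt and 3 GB on its third.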

conf/modules.config

Lines changed: 6 additions & 11 deletions
@@ -505,13 +505,11 @@ process {
505505
}
506506

507507
withName: BUSCO_BUSCO {
508-
tag = { "${meta.assembler}-${meta.binner}-${meta.id}" }
509-
ext.args = [
510-
params.busco_db ? '--offline' : ''
511-
].join(' ').trim()
508+
tag = { "${meta.assembler}-${meta.binner}-${meta.domain}-${meta.refinement}-${meta.id}" }
509+
ext.args = params.busco_db ? '--offline' : ''
512510
publishDir = [
513511
[
514-
path: { "${params.outdir}/GenomeBinning/QC/BUSCO/${meta.assembler}-${meta.binner}-${meta.id}" },
512+
path: { "${params.outdir}/GenomeBinning/QC/BUSCO/${meta.assembler}-${meta.binner}-${meta.domain}-${meta.refinement}-${meta.id}" },
515513
mode: params.publish_dir_mode,
516514
pattern: "*{.txt,.json,.log,-busco}",
517515
],
@@ -555,13 +553,12 @@ process {
555553
]
556554
}
557555

558-
withName: 'CONCAT_BUSCO_TSV|CONCAT_CHECKM_TSV|CONCAT_CHECKM2_TSV' {
556+
withName: 'CONCAT_BUSCO_TSV|CONCAT_CHECKM_TSV|CONCAT_CHECKM2_TSV|CONCAT_GUNC_TSV|CONCAT_GUNC_CHECKM_TSV' {
559557
ext.prefix = { "${meta.id}_summary" }
560558
ext.args = "--delimiter '\t'"
561559
publishDir = [
562560
path: { "${params.outdir}/GenomeBinning/QC" },
563561
mode: params.publish_dir_mode,
564-
saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
565562
]
566563
}
567564

@@ -594,19 +591,17 @@ process {
594591
]
595592
}
596593

597-
// Make sure to keep directory in sync with gunc_qc.nf
598594
withName: GUNC_RUN {
599595
publishDir = [
600-
path: { "${params.outdir}/GenomeBinning/QC/GUNC/raw/${meta.assembler}-${meta.binner}-${meta.domain}-${meta.refinement}-${meta.id}/${fasta.baseName}/" },
596+
path: { "${params.outdir}/GenomeBinning/QC/GUNC/raw/${meta.assembler}-${meta.binner}-${meta.domain}-${meta.refinement}-${meta.id}/" },
601597
mode: params.publish_dir_mode,
602598
saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
603599
]
604600
}
605601

606-
// Make sure to keep directory in sync with gunc_qc.nf
607602
withName: GUNC_MERGECHECKM {
608603
publishDir = [
609-
path: { "${params.outdir}/GenomeBinning/QC/GUNC/checkmmerged/${meta.assembler}-${meta.binner}-${meta.domain}-${meta.refinement}-${meta.id}/${checkm_file.baseName}" },
604+
path: { "${params.outdir}/GenomeBinning/QC/GUNC/checkmmerged/${meta.assembler}-${meta.binner}-${meta.domain}-${meta.refinement}-${meta.id}/" },
610605
mode: params.publish_dir_mode,
611606
saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
612607
]
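
The changelog entry for #1002 describes the new BUSCO publish directory as a fix for a filename collision. The Python sketch below shows how such a collision could arise when the directory name omits `domain` and `refinement`. It assumes two bin sets that differ only in one of those fields; the helper names and the meta values are hypothetical, and none of this is pipeline code.

```python
from collections import defaultdict


def busco_dir_old(meta: dict) -> str:
    """Directory name before this commit: assembler-binner-id."""
    return f"{meta['assembler']}-{meta['binner']}-{meta['id']}"


def busco_dir_new(meta: dict) -> str:
    """Directory name after this commit: assembler-binner-domain-refinement-id."""
    keys = ("assembler", "binner", "domain", "refinement", "id")
    return "-".join(meta[key] for key in keys)


def find_collisions(metas: list, dir_name) -> dict:
    """Group meta maps by directory name and keep names used more than once."""
    groups = defaultdict(list)
    for meta in metas:
        groups[dir_name(meta)].append(meta)
    return {name: group for name, group in groups.items() if len(group) > 1}


# Hypothetical meta maps: same assembler, binner and sample, different domain.
EXAMPLE_METAS = [
    {"assembler": "MEGAHIT", "binner": "MetaBAT2", "domain": "bacteria",
     "refinement": "unrefined", "id": "sample1"},
    {"assembler": "MEGAHIT", "binner": "MetaBAT2", "domain": "eukarya",
     "refinement": "unrefined", "id": "sample1"},
]

print("old scheme collisions:", list(find_collisions(EXAMPLE_METAS, busco_dir_old)))
print("new scheme collisions:", list(find_collisions(EXAMPLE_METAS, busco_dir_new)))
```

Under the old scheme both bin sets publish into the same directory, so identically named BUSCO outputs could overwrite each other. The longer name matches the pattern the GUNC publish paths in this file already use.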

conf/test_assembly_input.config

Lines changed: 2 additions & 5 deletions
@@ -55,11 +55,8 @@ params {
5555
postbinning_input = 'refined_bins_only'
5656
exclude_unbins_from_postbinning = true
5757

58-
// TODO: enable when we have a suitable way to run a small test
59-
// GUNC fails with exit code 1 if no matches, see https://github.com/grp-bork/gunc/issues/42
60-
// Solving this will make it possible to generate the BIgMAG file in tests too
61-
run_gunc = false
62-
gunc_db = params.pipelines_testdata_base_path + 'mag/databases/gunc/gunc-mock.dmnd'
58+
run_gunc = true
59+
gunc_database_type = 'test_data'
6360

6461
skip_metaeuk = false
6562
metaeuk_mmseqs_db = 'Kalamari'

docs/output.md

Lines changed: 3 additions & 3 deletions
@@ -617,7 +617,7 @@ If a lineage dataset is specified already with `--busco_db`, only results for th
617617
<details markdown="1">
618618
<summary>Output files</summary>
619619

620-
- `GenomeBinning/QC/BUSCO/[assembler]-[binner]-[sample/group]/`
620+
- `GenomeBinning/QC/BUSCO/[assembler]-[binner]-[domain]-[refinement]-[sample/group]/`
621621
- `[sample/group]-[lineage]-busco.batch_summary.txt`: Summary table of the BUSCO results for the bins in the sample.
622622
- `short_summary.generic.[lineage].[assembler]-[bin].{txt,json}`: A detailed BUSCO summary for each bin, available in both plain text and JSON format.
623623
- `[sample/group]-[lineage]-busco.log`: Log file of the BUSCO run.
@@ -739,9 +739,9 @@ Besides the reference files or output files created by CheckM, the following sum
739739
- `[gunc-database].dmnd`
740740
- `GUNC/`
741741
- `raw/`
742-
- `[assembler]-[binner]-[domain]-[refinement]-[sample/group]/[fasta input file name]/GUNC_checkM.merged.tsv`: Per sample GUNC [output](https://grp-bork.embl-community.io/gunc/output.html) containing with taxonomic and completeness QC statistics.
742+
- `[assembler]-[binner]-[domain]-[refinement]-[sample/group]/GUNC_checkM.merged.tsv`: Per sample GUNC [output](https://grp-bork.embl-community.io/gunc/output.html) containing with taxonomic and completeness QC statistics.
743743
- `checkmmerged/`
744-
- `[assembler]-[binner]-[domain]-[refinement]-[sample/group]/[checkm input file name]/GUNC.progenomes_2.1.maxCSS_level.tsv`: Per sample GUNC output merged with output from [CheckM](#checkm)
744+
- `[assembler]-[binner]-[domain]-[refinement]-[sample/group]/GUNC.progenomes_2.1.maxCSS_level.tsv`: Per sample GUNC output merged with output from [CheckM](#checkm)
745745

746746
</details>
747747

main.nf

Lines changed: 0 additions & 11 deletions
@@ -20,17 +20,6 @@ include { PIPELINE_INITIALISATION } from './subworkflows/local/utils_nfcore_mag_
2020
include { PIPELINE_COMPLETION } from './subworkflows/local/utils_nfcore_mag_pipeline'
2121
include { getGenomeAttribute } from './subworkflows/local/utils_nfcore_mag_pipeline'
2222

23-
/*
24-
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
25-
GENOME PARAMETER VALUES
26-
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
27-
*/
28-
29-
// TODO nf-core: Remove this line if you don't need a FASTA file [TODO: try and test using for --host_fasta and --host_genome]
30-
// This is an example of how to use getGenomeAttribute() to fetch parameters
31-
// from igenomes.config using `--genome`
32-
// params.fasta = WorkflowMain.getGenomeAttribute(params, 'fasta')
33-
3423
/*
3524
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
3625
NAMED WORKFLOWS FOR PIPELINE
