Description
To whom it may concern,
Thank you for the templates to use on HPC :D
I'm triggering an R targets pipeline via Snakemake, using an R Docker image run through Singularity. The pipeline:
```r
#!/usr/bin/env Rscript
library(targets)

# Project defaults; also provides the helpers used below
# (european_samples(), get_taxonomy(), long_abundance()).
work_dir <- "06-fungal-control"
source(here::here(paste("src", work_dir, "defaults.R", sep = "/")))

tar_option_set(
  packages = c("tidyverse", "tarchetypes"),
  format = "qs",
  memory = "transient",
  garbage_collection = TRUE,
  storage = "worker",
  retrieval = "worker"
)

library(future)
library(future.batchtools)

# Each target is submitted to Slurm as a batchtools job.
future::plan(
  tweak(
    future.batchtools::batchtools_slurm,
    template = "src/06-fungal-control/slurm.tmpl",
    resources = list(
      walltime = 259200, # minutes (NB: the stock batchtools template expects seconds)
      memory = 62500,
      ncpus = 4,
      ntasks = 1,
      partition = "standard",
      chunks.as.arrayjobs = TRUE
    )
  )
)

list(
  tar_target(
    metadata,
    read_tsv("raw/04-tedersoo-global-mycobiome/Tedersoo L, Mikryukov V, Anslan S et al. Fungi_GSMc_sample_metadata.txt")
  ),
  tar_target(
    continent_countries,
    read_csv("raw/05-countries-continent/countries.csv")
  ),
  tar_target(
    subset_samples,
    european_samples(metadata, continent_countries)
  ),
  tar_target(
    raw_abundance,
    read_tsv("raw/04-tedersoo-global-mycobiome/Fungi_GSMc_OTU_Table.txt")
  ),
  tar_target(
    taxonomy,
    get_taxonomy("raw/04-tedersoo-global-mycobiome/Tedersoo L, Mikryukov V, Anslan S et al. Fungi_GSMc_data_biom.biom")
  ),
  tar_target(
    raw_abundance_long,
    long_abundance(raw_abundance, subset_samples)
  )
)
```
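For context, `src/06-fungal-control/slurm.tmpl` is a batchtools brew template. A minimal sketch, modeled on the `slurm-simple.tmpl` shipped with batchtools rather than my exact file, looks like this (note that batchtools calls `sbatch`/`squeue` from the submitting R session, and that the file must end with a trailing newline, otherwise `readLines()` emits the "incomplete final line" warning seen in the log below):

```sh
#!/bin/bash
## Minimal batchtools Slurm template (sketch; the real file has
## site-specific lines). In the stock template, walltime is given in
## seconds, hence the division by 60 for Slurm's minutes.
#SBATCH --job-name=<%= job.name %>
#SBATCH --output=<%= log.file %>
#SBATCH --error=<%= log.file %>
#SBATCH --time=<%= ceiling(resources$walltime / 60) %>
#SBATCH --ntasks=<%= resources$ntasks %>
#SBATCH --cpus-per-task=<%= resources$ncpus %>
#SBATCH --mem-per-cpu=<%= resources$memory %>
#SBATCH --partition=<%= resources$partition %>
<%= if (array.jobs) sprintf("#SBATCH --array=1-%i", nrow(jobs)) else "" %>

## Each worker deserializes and runs its batchtools job collection.
Rscript -e 'batchtools::doJobCollection("<%= uri %>")'
```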
However, it doesn't work: R complains that the `squeue` command is not found. Here's the log:
```
Date = Tue May 16 10:59:08 CEST 2023
Hostname = node069
Working Directory = /home/qi47rin/proj/02-compost-microbes/src/06-fungal-control
Number of Nodes Allocated = 1
Number of Tasks Allocated = 1
Number of Cores/Task Allocated = 1
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job stats:
job count min threads max threads
get_fungal_spikein 1 1 1
targets 1 1 1
total 2 1 1
Select jobs to execute...
[Tue May 16 10:59:15 2023]
rule get_fungal_spikein:
input: src/06-fungal-control/analyze_server.R
output: logs/06-fungal-control/spike.log
jobid: 1
reason: Missing output files: logs/06-fungal-control/spike.log
resources: tmpdir=/tmp
Activating singularity image /home/qi47rin/proj/02-compost-microbes/.snakemake/singularity/8c1aaca4ec464428d6d90db9c1dc0fbf.simg
running
'/usr/local/lib/R/bin/R --no-echo --no-restore --no-save --no-restore --file=src/06-fungal-control/analyze_server.R'
here() starts at /home/qi47rin/proj/02-compost-microbes
Global env bootstraped.
here() starts at /home/qi47rin/proj/02-compost-microbes
Global env bootstraped.
✔ skip target continent_countries
✔ skip target metadata
✔ skip target subset_samples
✔ skip target taxonomy
• start target raw_abundance
✔ skip pipeline
Warning message:
In readLines(template) :
incomplete final line found on '/home/qi47rin/proj/02-compost-microbes/src/06-fungal-control/slurm.tmpl'
Error : Listing of jobs failed (exit code 127);
cmd: 'squeue --user=$USER --states=R,S,CG --noheader --format=%i -r'
output:
command not found
Error in tar_throw_run():
! ! in callr subprocess.
Caused by error:
! Listing of jobs failed (exit code 127);
cmd: 'squeue --user=$USER --states=R,S,CG --noheader --format=%i -r'
output:
command not found
Visit https://books.ropensci.org/targets/debugging.html for debugging advice.
Backtrace:
    ▆
 └─targets::tar_make_future(workers = 4)
   └─targets:::callr_outer(...)
     └─base::tryCatch(...)
       └─base (local) tryCatchList(expr, classes, parentenv, handlers)
         └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
           └─value[[3L]](cond)
             └─targets::tar_throw_run(...)
               └─rlang::abort(...)
Execution halted
[Tue May 16 10:59:27 2023]
Error in rule get_fungal_spikein:
jobid: 1
output: logs/06-fungal-control/spike.log
shell:
Rscript --no-save --no-restore --verbose src/06-fungal-control/analyze_server.R | tee logs/06-fungal-control/spike.log
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Removing output files of failed job get_fungal_spikein since they might be corrupted:
logs/06-fungal-control/spike.log
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: src/06-fungal-control/.snakemake/log/2023-05-16T105913.649364.snakemake.log
```
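Exit code 127 means the shell inside the container could not find `squeue` at all. The failure is easy to reproduce against the image path printed in the log above:

```sh
# 'which' exits non-zero inside this image because the Slurm client
# tools are simply not installed in it.
singularity exec \
  /home/qi47rin/proj/02-compost-microbes/.snakemake/singularity/8c1aaca4ec464428d6d90db9c1dc0fbf.simg \
  which squeue
```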
It worked before with conda, because when you activate an environment every host application remains available. In a container, however, `squeue` runs into problems when it queries the user ID and the Slurm system IDs. I also tried to mount the Slurm volumes into the container, roughly as sketched below ... but it didn't work either.
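The bind mounts were along these lines (illustrative paths only; where the Slurm binaries, libraries, and the Munge socket live differs per cluster):

```sh
# Expose the host's Slurm client tools and Munge socket to the container.
snakemake --use-singularity \
  --singularity-args "--bind /usr/bin/squeue --bind /usr/bin/sbatch \
    --bind /usr/lib64/slurm --bind /etc/slurm --bind /var/run/munge"
```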
So: is there a way to avoid the `squeue` command when using `tar_make_future()` to submit jobs to Slurm?

Thanks in advance,
AIlton.