modr install
¶
1. About¶
The modr
executable is composed of several inter-related sub commands. Please see modr -h
for all available options.
This part of the documentation describes options and concepts for modr install
sub command in more detail.
With minimal configuration, the install
sub command enables you to download the pipeline's resource bundle locally. This is necessary when setting up the pipeline on a new target system or cluster.
The pipeline uses a set of reference files to process the data. These reference files are required and need to be available on the local file system prior to execution. This command can be used to download any required reference files of the pipeline.
Since most resource bundles are very large; we recommend using multiple threads for pulling reference files concurrently. The resource bundle can be very large so please ensure you have sufficent disk space prior to running this sub command.
Please Note: The resource bundle requires about 2 GB of available disk space. If you are running the pipeline on the Biowulf cluster, you do NOT need to download the pipeline's resource bundle. It is already accessible to all HPC users. This sub command is for users running the pipeline outside of the Biowulf cluster.
Downloading the resource bundle is fast and easy! In its most basic form, modr install
only has one required input.
2. Synopsis¶
$ modr install [--help] [--dry-run] \
[--force] [--threads] \
--ref-path REF_PATH
The synopsis for each command shows its parameters and their usage. Optional parameters are shown in square brackets.
A user must provide a output directory for the reference file download via the --ref-path
argument. Once the download of the resource bundle has completed, a new child directory called modr will be created. This new directory will contain all of the pipeline's required reference files. The path to this new directory can be passed to the --resource-bundle
option of the modr run
subcomand. This allow users outside of Biowulf to run the pipeline.
Use you can always use the -h
option for information on a specific command.
2.1 Required Arguments¶
--ref-path REF_PATH
Path where the resource bundle will be downloaded.
type: pathAny resouces defined in the 'config/install.json' will be pulled onto the local filesystem. After the files have been downloaded, a new directory with the name
modr
will be created. It contains all the required reference files of the pipeline. The path to this new directory can be passed to the run sub command's--resource-bundle
option. Please see the run sub command for more information.Example:
--ref-path /data/$USER/refs
2.2 Options¶
Each of the following arguments are optional and do not need to be provided.
-h, --help
Display Help.
type: boolean flagShows command's synopsis, help message, and an example command
Example:
--help
--dry-run
Dry run the pipeline.
type: boolean flagDisplays what remote resources would be pulled. Does not execute anything!
Example:
--dry-run
--force
Force downloads all files.
type: boolean flagBy default, any files that do not exist locally are pulled; however if a previous instance of an install did not exit gracefully, it may be necessary to forcefully re-download all the files.
Example:
--force
--threads
Number of threads to use for concurrent file downloads.
type: int
default: 2Max number of threads to use for concurrent file downloads.
Example:
--threads 12
3. Example¶
# Step 0.) Grab an interactive node,
# do not run on head node!
srun -N 1 -n 1 --time=12:00:00 -p interactive --mem=24gb --cpus-per-task=12 --pty bash
module purge
module load singularity snakemake
# Step 1.) Dry-run download of the resource bundle
modr install --ref-path /data/$USER/refs \
--force \
--dry-run \
--threads 12
# Step 2.) Download the resource bundle,
# This command will NOT automatically submit
# a job to the cluster. As so, we recommend
# submitting this next command to the cluster
# as a job. Download speeds will vary so it
# is best to set the wall time to 2 days.
modr install --ref-path /data/$USER/refs \
--force \
--threads 12
# Checkout the downloaded files
cd /data/$USER/refs
tree modr
# modr/
# ├── kronatax_1222
# │ └── taxonomy.tab
# └── NCBI
# └── viral_genomes_taxid.fa