This documentation pertains to SNaQ v1.1 and may differ from the implementation originally described in Solís-Lemus & Ané (2016). See the SNaQ v1.0 documentation for the original implementation.
Improving runtimes
Parallel runs
For network estimation, multiple runs can be done in parallel. For example, if your machine has 4 or more processors (or cores), you can tell julia to use 4 processors by starting julia with julia -p 3, or by starting julia the usual way (julia) and then adding processors with:
using Distributed
addprocs(3)
The -p argument for julia denotes the number of additional processors, so julia -p 3 will start Julia with 4 processors in total.
If we load a package (using SNaQ) before adding processors, then we need to re-load it so that all processors have access to it:
@everywhere using PhyloNetworks, SNaQ
After that, running the snaq!(...) command will use different cores for the different independent runs specified by the optional runs argument, as processors become available.
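For instance, here is a minimal sketch, assuming a starting tree net0 and quartet concordance factor data raxmlCF were already read in (placeholder names, as in the Estimating a network section):

# with 4 processors (1 main + 3 workers), the 10 independent runs below
# are dispatched to the workers as they become available
net1 = snaq!(net0, raxmlCF, hmax=1, runs=10, filename="net1_parallel")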
When running bootsnaq, the analysis of each bootstrap replicate will use multiple cores to parallelize the separate runs of that particular bootstrap replicate. You may parallelize things further by running bootsnaq multiple times (on separate machines for instance), each time for a small subset of bootstrap replicates, and with a different seed each time.
We may tell julia to add more processors than our machine has, but we will not receive any performance benefit. At any time during the julia session, nworkers() tells us how many worker processors julia has access to.
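For example, continuing the addprocs(3) example above, we can check this (standard Julia, not specific to SNaQ):

nworkers()        # 3: worker processes available for the independent runs
Sys.CPU_THREADS   # number of logical cores that this machine actually has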
Below is an example of how to run many independent snaq! searches in parallel on a cluster running the slurm job manager (other managers would require a different, but similar, submit file). This example uses 2 files:
- a julia script file, to do many runs of snaq! in parallel, asking for many cores (default: 10 runs, asking for 10 cores). This julia script can take arguments: the maximum allowed number of hybridizations hmax, and the number of runs (to run 50 runs instead of 10, say).
- a submit file, to launch the julia script.
First: the example julia script, below or here, is assumed (by the submit file) to be called runSNaQ.jl. It uses a starting tree that is assumed to be available in a file named astraltree.tre, but that could be modified (to use a network with h=1 to start the search with hmax=2, for instance). It also assumes that the quartet concordance factor data are in file tableCF_speciesNames.csv. Again, this file name should be adjusted. To run this julia script for 50 runs and hmax=3, do julia runSNaQ.jl 3 50.
#!/usr/bin/env julia
# file "runSNaQ.jl". run in the shell like this in general:
# julia runSNaQ.jl hvalue nruns
# example for h=2 and default 10 runs:
# julia runSNaQ.jl 2
# or example for h=3 and 50 runs:
# julia runSNaQ.jl 3 50
length(ARGS) > 0 ||
error("need 1 or 2 arguments: # reticulations (h) and # runs (optional, 10 by default)")
h = parse(Int, ARGS[1])
nruns = 10
if length(ARGS) > 1
nruns = parse(Int, ARGS[2])
end
outputfile = string("net", h, "_", nruns, "runs") # example: "net2_10runs"
seed = 1234 + h # change as desired! Best to have it different for different h
@info "will run SNaQ with h=$h, # of runs=$nruns, seed=$seed, output will go to: $outputfile"
using Distributed
addprocs(nruns)
@everywhere using PhyloNetworks, SNaQ # load packages on all processors
net0 = readnewick("astraltree.tre");
using DataFrames, CSV
df_sp = CSV.read("tableCF_speciesNames.csv", DataFrame; pool=false);
d_sp = readtableCF!(df_sp);
net = snaq!(net0, d_sp, hmax=h, filename=outputfile, seed=seed, runs=nruns)
When julia is called on a script, whatever comes after "julia scriptname" is given to julia in an array of values. This array is called ARGS. So if we call a script like this: julia runSNaQ.jl 2 then the script will know the arguments through ARGS, which would contain a single element, "2". This first element is just a string at this stage. We want to use it as a number, so we need to ask julia to parse the string into an integer.
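To illustrate the parsing step only (this is not part of the script; the values shown assume the call julia runSNaQ.jl 2):

ARGS                     # ["2"]  -- a vector of strings
h = parse(Int, ARGS[1])  # 2, now an integer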
Second: we need a "submit" file to ask a job scheduler like slurm to submit our julia script to a cluster. In the submit file below, the first 5 lines set things up for slurm. They are most likely to be specific to your cluster. The main idea here is to use a slurm "array" from 0 to 3, to run our julia script multiple times, 4 times actually: from hmax=0 to hmax=3. Each would do 30 runs (and each would be allocated 30 cores in the submit script below). Then log out of the cluster and go for coffee.
#!/bin/bash
#SBATCH -o path/to/slurm/log/file/runsnaq_slurm%a.log
#SBATCH -J runsnaq
#SBATCH --array=0-3
#SBATCH -c 30
## --array: to run multiple instances of this script,
## one for each value in the array.
## 1 instance = 1 task
## -J job name
## -c number of cores (CPUs) per task
echo "slurm task ID = $SLURM_ARRAY_TASK_ID used as hmax"
echo "start of SNaQ parallel runs on $(hostname)"
# finally: launch the julia script, using Julia executable appropriate for slurm, with full paths:
/workspace/software/bin/julia --history-file=no -- runSNaQ.jl $SLURM_ARRAY_TASK_ID 30 > net${SLURM_ARRAY_TASK_ID}_30runs.screenlog 2>&1
echo "end of SNaQ run ..."
Parallel quartet likelihood
This section describes the scalability improvements introduced in Kolbow et al. (2025) for SNaQ v1.1.
Each step of the optimization involves computing the likelihood of each quartet. Since SNaQ treats quartet likelihoods as independent, they can be computed in parallel with multi-threading. To enable multi-threading, the user needs to specify how many threads are available when starting a Julia session, using the --threads flag:
julia --threads=8  # use 8 threads
SNaQ then automatically multi-threads the quartet likelihood computations whenever multiple threads are available. Setting --threads=auto uses all available CPU threads.
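Within the running session, we can check how many threads Julia was started with (standard Julia, not specific to SNaQ):

Threads.nthreads()   # number of threads available in this session (8 in the example above)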
Quartet subsampling
For a network with $N$ taxa, there are $\binom{N}{4}$ different quartets, so the cost of the likelihood computation grows quartically with the number of taxa. In cases where the number of taxa makes network estimation prohibitively slow, we implemented a strategy that uses only a fraction of all quartets when computing the likelihood. For a network with $\binom{N}{4}$ quartets, the optional propQuartets argument can be used to randomly sample $\lceil \binom{N}{4} \cdot$ propQuartets $\rceil$ quartets. Although we lose some information when subsampling quartets, using propQuartets as low as 0.5 has been shown to not significantly decrease accuracy.
We can run the same analysis as in the Estimating a network section and compare the two networks when we use only a fraction of the quartets.
raxmltrees = joinpath(dirname(pathof(SNaQ)), "..","examples","raxmltrees.tre");
raxmlCF = readtrees2CF(raxmltrees)
astralfile = joinpath(dirname(pathof(SNaQ)), "..","examples","astral.tre");
astraltree = readmultinewick(astralfile)[102] # 102nd tree: last tree here
net0 = snaq!(astraltree, raxmlCF, hmax=0, filename="net0", propQuartets=0.75)
Removing uninformative quartets
We can further reduce computational costs with minimal detriment to accuracy by ignoring uninformative quartets. A star tree would give concordance factors of $\frac{1}{3}$ for all quartet topologies; thus, quartets with CFs near $\frac{1}{3}$ may not be informative of the overall species topology. We can check for and remove uninformative quartets by setting the optional keyword argument qinfTest to true. Any quartets with concordance factors sufficiently close to $\frac{1}{3}$ will then be removed when computing the composite likelihood. Further, the optional keyword argument qtolAbs can be used to specify the tolerance for determining which concordance factors are "close enough" to $\frac{1}{3}$.
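For example, a minimal sketch reusing astraltree and raxmlCF from the block above (the qtolAbs value of 0.01 is purely illustrative, not a recommended default):

# qinfTest: test for and drop quartets whose CFs are close enough to 1/3
# qtolAbs: tolerance for "close enough" (the value 0.01 is chosen arbitrarily here)
net0b = snaq!(astraltree, raxmlCF, hmax=0, filename="net0_qinf",
              qinfTest=true, qtolAbs=0.01)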