Introduction
With PhyloNetworks installed, we can load the package and start using it to read, manipulate, and analyze phylogenetic trees and networks in Julia.
julia> using PhyloNetworks
Here is a very small test to see if we correctly installed and loaded PhyloNetworks.
julia> net = readnewick("(A,(B,(C,D)));");
julia> tiplabels(net)
4-element Vector{String}:
 "A"
 "B"
 "C"
 "D"
You can see a list of all the functions with varinfo(PhyloNetworks), and press ? inside Julia to switch to help mode, followed by the name of a function (or type), to get more details about it.
Often you may wish to work in the directory that contains your data. To change the directory used by Julia in a session, say to the "examples" folder found in the PhyloNetworks source directory, you have two options:
- quit your session, navigate to the directory and restart julia there.
- or change the working directory within your Julia session by using the cd() function.
The following code changes the working directory to the examples folder within PhyloNetworks' source directory.
julia> examples_path = joinpath(dirname(dirname(pathof(PhyloNetworks))), "examples");
julia> cd(examples_path)
You will need to set the path to the folder where your data are located.
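If you only need a different working directory for a few commands, cd also accepts a function, usually written as a do block, and returns to the previous directory afterwards. Here is a minimal sketch using Base Julia only; the temporary directory and the file name example.tre are placeholders standing in for your own data folder.

```julia
# A temporary directory stands in for a data folder (placeholder for illustration).
datadir = mktempdir()
write(joinpath(datadir, "example.tre"), "(A,(B,(C,D)));\n")

startdir = pwd()

# cd(f, dir) runs f inside dir, then returns to the previous directory.
files_seen = cd(datadir) do
    readdir()            # file names are relative to datadir here
end

println(files_seen)          # files found in the data folder
println(pwd() == startdir)   # the working directory was restored
```

This form is convenient in scripts, because the rest of the session keeps its original working directory.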
Julia types
Each object in Julia has a type. Here are small examples of how to get more information about an object. To find the type of a particular object, use typeof.
For example, let's read a list of gene trees that come with the package. First, we need the file name. Assuming we are in the "examples" folder:
julia> raxmltreefile = joinpath(examples_path, "raxmltrees.tre") # raxmltreefile = "raxmltrees.tre" # if your working directory contains the file
"/home/runner/work/PhyloNetworks.jl/PhyloNetworks.jl/examples/raxmltrees.tre"
julia> typeof(raxmltreefile)
String
The object raxmltreefile is a basic string (of characters). Let's create our list of gene trees by reading this file. Note that if you changed your working directory as mentioned above, you do not need joinpath to join the path to the examples folder with the file name.
julia> genetrees = readmultinewick(raxmltreefile); # the semicolon suppresses info on the result
julia> typeof(genetrees)
Vector{HybridNetwork} (alias for Array{HybridNetwork, 1})
which shows us that genetrees is of type Vector{HybridNetwork}, that is, a vector containing networks. If we want to know about the attributes the object has, we can type ? in Julia, followed by HybridNetwork, for a description.
Typing varinfo() will provide a list of objects and packages in memory, including raxmltreefile and genetrees that we just created.
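Beyond typeof, a few Base functions help when inspecting a container of networks. The sketch below builds a small vector from newick strings, so that no file is needed; the two strings are made up for illustration.

```julia
using PhyloNetworks

# Two small trees given as newick strings (illustrative, not from the package).
newicks = ["(A,(B,(C,D)));", "((A,B),(C,D));"]
trees = [readnewick(s) for s in newicks]

println(typeof(trees))                # type of the whole container
println(eltype(trees))                # type of each element
println(trees[1] isa HybridNetwork)   # test one object against a type
```

eltype is handy for vectors, because it reports the element type without indexing into the vector.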
Quick start
As a sanity check, we can look at the length of our list of gene trees to make sure that we have all the gene trees we expected, and check that the third tree has the taxon names we expected:
julia> length(genetrees)
30
julia> tiplabels(genetrees[3])
6-element Vector{String}:
 "E"
 "A"
 "B"
 "C"
 "D"
 "O"
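The same kind of check can be run over every tree at once. The sketch below is one possible way to do it, not a function provided by the package: it compares the tip labels of each tree against an expected set of taxa and reports the trees that differ. The three in-memory trees and the expected set are made up for illustration; with real data you would loop over genetrees instead.

```julia
using PhyloNetworks

# Three small in-memory trees stand in for gene trees read from a file.
trees = [readnewick(s) for s in
         ["(A,(B,(C,D)));", "((A,B),(C,D));", "(A,(B,C));"]]

expected = Set(["A", "B", "C", "D"])   # taxa we expect in every tree

# Indices of the trees whose tip labels differ from the expected set.
mismatched = [i for (i, t) in enumerate(trees) if Set(tiplabels(t)) != expected]
println(mismatched)   # here, only the third tree lacks taxon D
```

Comparing sets rather than vectors ignores the order in which the tips are listed, which generally differs between gene trees.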
We can also see some basic information on the third gene tree, say:
julia> genetrees[3]
HybridNetwork, Rooted Network
9 edges
10 nodes: 6 tips, 0 hybrid nodes, 4 internal tree nodes.
tip labels: E, A, B, C, ...
((E:0.015,(A:0.006,B:0.006):0.003):0.041,(C:0.006,D:0.0):0.041,O:0.052);
To visualize any of these gene trees, use the PhyloPlots package:
julia> using PhyloPlots
julia> plot(genetrees[3]); # tree for 3rd gene
Phylogenetic networks
In phylogenetics, there are two types of networks:
Explicit networks have a biological interpretation: internal nodes represent ancestral species (or populations); the main evolutionary history is depicted by the "major tree". Various methods that estimate explicit networks use models that account for ILS and for gene tree estimation error.
Implicit networks are typically descriptive: internal nodes do not represent ancestral species. Implicit networks do not discriminate between ILS, gene flow/hybridization or gene tree estimation error, and can be hard to interpret biologically.

In PhyloNetworks, we consider explicit phylogenetic networks exclusively.
Extended newick format
In parenthetical format, internal nodes can have a name, like node C below, in a tree written as (A,B)C in newick format:

To represent networks in parenthetical format, the extended newick format splits each hybrid node into two nodes with the same name:

By convention, the hybrid tag is # followed by H, LGT or R, then a number (for example #H1), and the minor hybrid edge leads to a leaf.
Thus, we get: (((A,(B)#H1),(C,#H1)),D);. We can write inheritance probabilities in the parenthetical format: (C,#H1):branch length:bootstrap support:inheritance probability.
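To make the colon-separated fields concrete, here is a small sketch that reads a network whose hybrid edges carry a branch length and an inheritance probability, with the support field left empty. The newick string is made up for illustration. The field names e.hybrid, e.gamma, e.ismajor and e.number are assumed to match the column names printed by printedges; check the help for Edge if they differ in your version.

```julia
using PhyloNetworks

# Hybrid edges written as #H1:length::gamma (support left empty).
newick = "(((A:4.0,(B:1.0)#H1:1.1::0.9):0.5,(C:0.6,#H1:1.0::0.1):1.0):3.0,D:5.0);"
net = readnewick(newick)

# Keep only the hybrid edges (assumed field: e.hybrid).
hybridedges = [e for e in net.edge if e.hybrid]

for e in hybridedges
    println("edge ", e.number, ": gamma = ", e.gamma, ", major = ", e.ismajor)
end
```

The two hybrid edges share the same child node, and their inheritance probabilities sum to 1.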
We can read a network from a newick-formatted string, and, for example, print a list of its edges:
julia> newickstring = "(((A,(B)#H1),(C,#H1)),D);";
julia> net = readnewick(newickstring);
julia> printedges(net)
edge parent child  length  hybrid ismajor gamma  containroot i_cycle
1    -4     1              false  true    1      true        -1
2    3      2              false  true    1      false       -1
3    -4     3              true   true           true        -1
4    -3     -4             false  true    1      true        -1
5    -6     4              false  true    1      true        -1
6    -6     3              true   false          true        -1
7    -3     -6             false  true    1      true        -1
8    -2     -3             false  true    1      true        -1
9    -2     5              false  true    1      true        -1
We see that the edges do not have branch lengths, and the hybrid edges do not have gamma (inheritance) values. We can set them with
julia> setlength!(net.edge[1], 1.9)
julia> setgamma!(net.edge[3], 0.8)
julia> printedges(net)
edge parent child  length  hybrid ismajor gamma  containroot i_cycle
1    -4     1      1.900   false  true    1      true        -1
2    3      2              false  true    1      false       -1
3    -4     3              true   true    0.8    true        -1
4    -3     -4             false  true    1      true        -1
5    -6     4              false  true    1      true        -1
6    -6     3              true   false   0.2    true        -1
7    -3     -6             false  true    1      true        -1
8    -2     -3             false  true    1      true        -1
9    -2     5              false  true    1      true        -1
where 1 and 3 are the positions, in the list of edges, of the edges to modify. Setting γ on one hybrid edge also updates its partner hybrid edge (here edge 6, now at 0.2) so that the two values sum to 1. We can only change the γ value of hybrid edges, not of tree edges (for which γ=1 necessarily). The attempt below causes an error, with a message explaining that the edge is a tree edge:
setgamma!(net.edge[4], 0.7)
# should return this:
# ERROR: cannot change gamma in a tree edge
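In a script, you may prefer to catch this error rather than let it stop the run. Here is a minimal sketch, assuming the same network as above, in which edge 4 is a tree edge according to the edge list; the variable outcome is only there for illustration.

```julia
using PhyloNetworks

net = readnewick("(((A,(B)#H1),(C,#H1)),D);")
treeedge = net.edge[4]   # a tree edge, according to the edge list above

# Attempt the change and record what happened instead of stopping on error.
outcome = try
    setgamma!(treeedge, 0.7)
    "changed"
catch err
    "refused: " * sprint(showerror, err)
end

println(outcome)
```

Checking the hybrid column of printedges (or the edge's hybrid attribute) before calling setgamma! avoids the error altogether.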