internal documentation

Documentation for PhyloNetworks's internal functions. These functions are not exported and their access (API) should not be considered stable. But they can still be used, like this for example: PhyloNetworks.foo() for a function named foo().

functions & types

PhyloNetworks.ANodeType
ANode

Abstract node. An object of type EdgeT has a node attribute, which is an vector of 2 objects of some subtype of ANode. The concrete type Node is a subtype of ANode, and has an edge attribute, which is vector of Edge objects (where Edge is an alias for EdgeT{Node}).

source
PhyloNetworks.EdgeTType
EdgeT{node type}
Edge = EdgeT{Node}
Edge(number, length=1.0)

Data structure for an edge and its various attributes. Most notably:

  • number (integer): serves as unique identifier; remains unchanged when the network is modified, with a nearest neighbor interchange for example
  • node: vector of Nodes, normally just 2 of them
  • ischild1 (boolean): true if node[1] is the child node of the edge, false if node[1] is the parent node of the edge
  • length: branch length
  • hybrid (boolean): whether the edge is a tree edge or a hybrid edge (in which case ischild1 is important, even if the network is semi-directed)
  • gamma: proportion of genetic material inherited by the child node via the edge; 1.0 for a tree edge
  • ismajor (boolean): whether the edge is the major path to the child node; true for tree edges, since a tree edge is the only path to its child node; normally true if gamma>0.5.
  • containroot (boolean): is the interior of this edge a valid rooting position? That is, if we added a node on the edge, could we direct all edges away from this new node to root the semidirected network and get a valid rooted network?

and other fields, used very internally

source
PhyloNetworks.MatrixTopologicalOrderType
MatrixTopologicalOrder

Matrix associated to a HybridNetwork in which rows and/or columns correspond to nodes in the network. In this matrix, nodes are indexed by their topological order. For example, if rows list nodes in network net, then V[i,:] corresponds to node net.vec_node[i].

The following functions and extractors can be applied to it: tiplabels, obj[:tips], obj[:internalnodes], obj[:tipsnodes]. See documentation for getindex(::MatrixTopologicalOrder, ::Symbol)).

A MatrixTopologicalOrder object has field V (the matrix of interest) and fields for mapping indices in V to node numbers nodenumbers_toporder, internalnodenumbers, tipnumbers, tipnames, indexation. Type in "?MatrixTopologicalOrder.field" to get documentation on a specific field.

source
PhyloNetworks.NodeType
Node(number, leaf)
Node(number, leaf, hybrid)

Data structure for a node and its various attributes. Most notably:

  • number (integer): serves as unique identifier; remains unchanged when the network is modified, with a nearest neighbor interchange for example
  • leaf (boolean): whether the node is a leaf (with data typically) or an internal node (no data typically)
  • name (string): taxon name for leaves; internal node may or may not have a name
  • edge: vector of Edges that the node is attached to; 1 if the node is a leaf, 2 if the node is the root, 3 otherwise, and potentially more if the node has a polytomy
  • hybrid (boolean): whether the node is a hybrid node (with 2 or more parents) or a tree node (with a single parent)
source
PhyloNetworks.PartitionType
Partition

Data structure for a collection of edges and (node) numbers. Fields:

  • cycle: vector of Int
  • edges: vector of Edge

process_biconnectedcomponents! uses edges to store the edges in a given biconnected component. (blocks = biconnected components partition edges, while blobs = 2-edge connected components partition nodes. They are in 1-to-1 correspondence in binary networks.) cycle is used to store the indices, in net.vec_node, of the block's articulation nodes (or root), with the first one being the entry node (or root) and other being exit articulation nodes. Note that a leaf is not an articulation node: it does not connect the pendent edge to another block.

`cycle` is a legacy name from SNaQ

For level-1 networks, SNaQ uses 1 partition for each maximum tree of cut-edges. edges stores the cut edges in this tree. cycle stores an integer for each cycle that the tree is adjacent to. This integer is the number of the unique hybrid node in this cycle.

See also entrynode_preindex, number_exitnodes

source
PhyloNetworks.QuartetTType

QuartetT{T}

Generic type for 4-taxon sets. Fields:

  • number: rank of the 4-taxon set
  • taxonnumber: static vector of 4 integers, assumed to be distinct and sorted
  • data: object of type T

For easier look-up, a unique mapping is used between the rank (number) of a 4-taxon set and its 4 taxa (see quartetrank and nchoose1234):

rank-1 = (t1-1) choose 1 + (t2-1) choose 2 + (t3-1) choose 3 + (t4-1) choose 4

examples

julia> nCk = PhyloNetworks.nchoose1234(5)
6×4 Matrix{Int64}:
 0   0   0  0
 1   0   0  0
 2   1   0  0
 3   3   1  0
 4   6   4  1
 5  10  10  5

julia> PhyloNetworks.QuartetT(1,3,4,6, [.92,.04,.04, 100], nCk)
4-taxon set number 8; taxon numbers: 1,3,4,6
data: [0.92, 0.04, 0.04, 100.0]
source
PhyloNetworks.TopologyConstraintType

Type for various topological constraints, such as:

  1. a set of taxa forming a clade in the major tree
  2. a set of individuals belonging to the same species
  3. a set of taxa forming a clade in any one of the displayed trees
source
PhyloNetworks.TopologyConstraintMethod
TopologyConstraint(type::UInt8, taxonnames::Vector{String}, net::HybridNetwork)

Create a topology constraint from user-given type, taxon names, and network. There are 3 types of constraints:

  • type 1: A set of tips that forms a species.
  • type 2: A set of tips that forms a clade in the major tree. Note that the root matters. Constraining a set of species to be an outgroup is equivalent to constraining the ingroup to form a clade.
  • type 3: A set of tips that forms a clade in any one of the displayed trees.

Note: currently, with type-1 constraints, hybridizations are prevented from coming into or going out of a species group.

examples

julia> net = readnewick("(((2a,2b),(((((1a,1b,1c),4),(5)#H1),(#H1,(6,7))))#H2),(#H2,10));");

julia> c_species1 = PhyloNetworks.TopologyConstraint(0x01, ["1a","1b","1c"], net)
Species constraint, on tips: 1a, 1b, 1c
 stem edge number 7
 crown node number -9

julia> c_species2 = PhyloNetworks.TopologyConstraint(0x01, ["2a","2b"], net)
Species constraint, on tips: 2a, 2b
 stem edge number 3
 crown node number -4

julia> nni!(net , net.edge[3], true, true, [c_species2]) === nothing # we get nothing: would break species 2
true

julia> # the following gives an error, because 4,5 do not form a clade in "net"
       # PhyloNetworks.TopologyConstraint(0x02, ["4","5"], net)

julia> c_clade145 = PhyloNetworks.TopologyConstraint(0x02, ["1a","1b","1c","4","5"], net)
Clade constraint, on tips: 1a, 1b, 1c, 4, 5
 stem edge number 12
 crown node number -7

julia> nni!(net , net.edge[12], true, true, [c_species1, c_clade145]) # we get nothing (failed NNI): would break the clade
source
PhyloNetworks.exitnodes_preindexType
exitnodes_preindex(biconnected_component::Partition)

Iterator over the preorder index of articulation exit nodes of a biconnected component, considering the network rooted (see number_exitnodes). Indices are indices in the network's preordering .vec_node.

source
Base.getindexFunction
getindex(obj::MatrixTopologicalOrder,
         d::Symbol,[ indtips, nonmissing])

Get submatrices of a MatrixTopologicalOrder object. In obj.V, row and/column i corresponds to the ith node in topological order. In contrast, in matrix obj[:tips] for example, row and/or column i corresponds to the ith tip when tips are listed in the same order as in the network's original .node vector.

Arguments:

  • obj::MatrixTopologicalOrder: the matrix from which to extract.
  • d: symbol specifying which sub-matrix to extract. Can be:
    • :tips columns and/or rows corresponding to the tips
    • :internalnodes columns and/or rows corresponding to the internal nodes Includes tips not listed in indtips or missing data according to nonmissing.
    • :tipsnodes columns corresponding to internal nodes, and row to tips (works only is indexation="b")
    • :all nodes, listed in topological order
  • indtips::Vector{Int}: optional argument precising a specific order for the tips (internal use).
  • nonmissing::BitVector: optional argument saying which tips have data (internal use). Tips with missing data are treated as internal nodes.
source
PhyloNetworks._getnodeheightsFunction
_getnodeheights(net::HybridNetwork, fixmissing::Bool,
                inconsistencyhandler::Function, checkpreorder::Bool=true)

Helper to determine time-consistency and calculate node heights (distance from the root), used by getnodeheights for example.

output: (isconsistent, nodes_distance_from_root)

Arguments:

  • fixmissing:

    • if false, any missing hybrid edge length will cause a warning, the network is not modified, and the best-case is assumed to determine time-consistency (as explained below)
    • if true, will attempt to find values for missing lengths of hybrid edges, if any, to make the network time-consistent, placing hybrid nodes as close to the root as possible (as this gives most chances to find a time-consistent assignment of all missing hybrid edge lengths). If there is a time-consistent assignment, then the network is modified (with missing hybrid edge lengths set to time-consistent values). Otherwise, the network is not modified.

    This option is passed to update_getnodeheights_hybrid! that handles 1 hybrid node at a time.

  • inconsistencyhandler: function to check & handle time-consistency as desired as a given hybrid node h, and to decide if the traversal should continue. It should take as input:

    • a vector of ≥1 candicate heights for h from parent edges with non-missing length, and
    • a vector of ≥0 heights of parent nodes whose child edge to h has no length
    • isconsistent: a boolean that is modified to false if the network is not time-consistent at h (unless an error is thrown anyway!).

    This handler function decides what to do if the candidate heights are not all equal (the network is time-inconsistent), and if the values to be assigned to missing edge lengths would be negative. Its output should be: (keepgoing_boolean, hybrid_node_height).

    Examples: timeinconsistency_error is conservative and throws an error in both cases. timeinconsistency_average is lenient: only throws warnings, but keeps going and returns γ-weighted average node heights

source
PhyloNetworks.addAlternativeHybridizations!Method
addAlternativeHybridizations!(net::HybridNetwork, BSe::DataFrame;
                              cutoff::Number=10, top::Int=3)

Modify the network net (the best estimated network) by adding some of the hybridizations present in the bootstrap networks. By default, it will only add hybrid edges with more than 10% bootstrap support (cutoff) and it will only include the top 3 hybridizations (top) sorted by bootstrap support.

The dataframe BSe is also modified. In the original BSe, supposedly obtained with hybridclades_support, hybrid edges that do not appear in the best network have a missing number. After hybrid edges from bootstrap networks are added, BSe is modified to include the edge numbers of the newly added hybrid edges. To distinguish hybrid edges present in the original network versus new edges, an extra column of true/false values is also added to BSe, named "alternative", with true for newly added edges absent from the original network.

The hybrid edges added to net are added as minor edges, to keep the underlying major tree topology.

example

julia> bootnet = readmultinewick(joinpath(dirname(pathof(PhyloNetworks)), "..","examples", "bootsnaq.out")); # vector of 10 networks

julia> bestnet = readnewick("((O,(E,#H7:::0.196):0.314):0.332,(((A)#H7:::0.804,B):10.0,(C,D):10.0):0.332);");

julia> BSn, BSe, BSc, BSgam, BSedgenum = hybridclades_support(bootnet, bestnet);

julia> BSe[1:6,[:edge,:hybrid_clade,:sister_clade,:BS_hybrid_edge]]
6×4 DataFrame
 Row │ edge     hybrid_clade  sister_clade  BS_hybrid_edge 
     │ Int64?   String        String        Float64        
─────┼─────────────────────────────────────────────────────
   1 │       7  H7            B                       33.0
   2 │       3  H7            E                       32.0
   3 │ missing  c_minus3      c_minus8                44.0
   4 │ missing  c_minus3      H7                      44.0
   5 │ missing  E             O                       12.0
   6 │ missing  c_minus6      c_minus8                 9.0

julia> PhyloNetworks.addAlternativeHybridizations!(bestnet, BSe)

julia> BSe[1:6,[:edge,:hybrid_clade,:sister_clade,:BS_hybrid_edge,:alternative]]
6×5 DataFrame
 Row │ edge     hybrid_clade  sister_clade  BS_hybrid_edge  alternative 
     │ Int64?   String        String        Float64         Bool        
─────┼──────────────────────────────────────────────────────────────────
   1 │       7  H7            B                       33.0        false
   2 │       3  H7            E                       32.0        false
   3 │      16  c_minus3      c_minus8                44.0         true
   4 │      19  c_minus3      H7                      44.0         true
   5 │      22  E             O                       12.0         true
   6 │ missing  c_minus6      c_minus8                 9.0        false

julia> # using PhyloPlots; plot(bestnet, edgelabel=BSe[:,[:edge,:BS_hybrid_edge]]);
source
PhyloNetworks.addHybridBetweenClades!Method
addHybridBetweenClades!(net::HybridNetwork, hybnum::Number, sisnum::Number)

Modify net by adding a minor hybrid edge from "donor" to "recipient", where "donor" is the major parent edge e1 of node number hybnum and "recipient" is the major parent edge e2 of node number sisnum. The new nodes are currently inserted at the middle of these parent edges.

If a hybrid edge from e1 to e2 would create a directed cycle in the network, then this hybrid cannot be added. In that case, the donor edge e1 is moved up if its parent is a hybrid node, to ensure that the sister clade to the new hybrid would be a desired (the descendant taxa from e1) and a new attempt is made to create a hybrid edge.

Output: number of the new hybrid edge, or nothing if the desired hybridization is not possible.

See also: addhybridedge! (used by this method) and directionalconflict to check that net would still be a DAG.

source
PhyloNetworks.addhybridedge!Function
addhybridedge!(
    net::HybridNetwork,
    edge1::Edge,
    edge2::Edge,
    hybridpartnernew::Bool,
    edgelength::Float64=-1.0,
    gamma::Float64=-1.0
)

Add hybridization to net coming from edge1 going into edge2. 2 new nodes and 3 new edges are created: edge1 are edge2 are both cut into 2 edges, and a new edge is created linking the 2 new "middle" nodes, pointing from edge1 to edge2. The new node in the middle of edge1 is a tree node. The new node in the middle of edge2 is a hybrid node. Its parent edges are the newly created hybrid edge (with γ = gamma, missing by default), and either the newly edge "above" edge2 if hybridpartnernew=true, or the old edge2 otherwise (which would reverse the direction of edge2 and others).

Should be called from the other method, which performs a bunch of checks. Updates containroot attributes for edges below the new hybrid node.

Output: new hybrid node (middle of the old edge2) and new hybrid edge.

examples

julia> net = readnewick("((S8,(((S1,(S5)#H1),(#H1,S6)))#H2),(#H2,S10));");

julia> hybnode, hybedge = PhyloNetworks.addhybridedge!(net, net.edge[13], net.edge[8], true, 0.0, 0.2)
(PhyloNetworks.Node:
 number:9
 name:H3
 hybrid node
 attached to 3 edges, numbered: 8 16 17
, PhyloNetworks.EdgeT{PhyloNetworks.Node}:
 number:17
 length:0.0
 minor hybrid edge with gamma=0.2
 attached to 2 node(s) (parent first): 8 9
)


julia> writenewick(net)
"((S8,(((S1,(S5)#H1),((#H1,S6))#H3:::0.8))#H2),(#H2,(S10,#H3:0.0::0.2)));"
source
PhyloNetworks.addhybridedge!Method
addhybridedge!(
    [rng::AbstractRNG,]
    net::HybridNetwork,
    nohybridladder::Bool,
    no3cycle::Bool,
    constraints::Vector{TopologyConstraint}=TopologyConstraint[];
    maxattempts::Int=10,
    fixroot::Bool=false
)

Randomly choose two edges in net then: add hybrid edge from edge 1 to edge 2 of length 0.01. The two halves of edge 1 (and of edge 2) have equal lengths. The hybrid partner edge (top half of edge 2, if fixroot is true) will point towards the newly-created node on the middle of the original edge 2, and have an inheritance γ randomly uniformly chosen in (0,0.5).

If the resulting network is a DAG, satisfies the constraint(s), does not contain any 3-cycle (if no3cycle=true), and does not have a hybrid ladder (if nohybridladder=true) then the proposal is successful: net is modified, and the function returns the newly created hybrid node and newly created hybrid edge.

If the resulting network is not acceptable, then a new set of edges is proposed (using a blacklist) until one is found acceptable, or until a maximum number of attempts have been made (maxattempts). If none of the attempted proposals are successful, nothing is returned (without causing an error).

After a pair of edges is picked, the "top" half of edge2 is proposed as the partner hybrid edge with probability 0.8 if fixroot is false, (to avoid changing the direction of edge2 with more than 50% chance) and with probability 1.0 if fixroot is true. If this choice does not work and if fixroot is false, the other half of edge2 is proposed as the partner hybrid edge. Note that choosing the "bottom" half of edge2 as the partner edge requires to flip the direction of edge 2, and to move the root accordingly (to the original child of edge2).

examples

julia> net = readnewick("((S1,(((S2,(S3)#H1),(#H1,S4)))#H2),(#H2,S5));");

julia> using Random

julia> Random.seed!(170);

julia> PhyloNetworks.addhybridedge!(net, true, true)
(PhyloNetworks.Node:
 number:9
 name:H3
 hybrid node
 attached to 3 edges, numbered: 5 16 17
, PhyloNetworks.EdgeT{PhyloNetworks.Node}:
 number:17
 length:0.01
 minor hybrid edge with gamma=0.32771460911632916
 attached to 2 node(s) (parent first): 8 9
)

julia> writenewick(net, round=true, digits=2)
"((S1,(((#H1,S4),((S2,(S3)#H1))#H3:::0.67))#H2),((#H2,S5),#H3:0.01::0.33));"
source
PhyloNetworks.addindividuals!Method
addindividuals!(net::HybridNetwork, species::AbstractString, individuals::Vector)

Add individuals to their species leaf as a star, and return the corresponding species constraint. nothing is returned in 2 cases:

  • if individuals contains only 1 individual, in which case the name of the leaf for that species is replaced by the name of its one individual representative
  • if species is not found in the network

Spaces in individuals' names are eliminated, see cleantaxonname.

Called by mapindividuals.

examples

julia> net = readnewick("(S8,(((S1,S4),(S5)#H1),(#H1,S6)));");

julia> PhyloNetworks.addindividuals!(net, "S1", ["S1A", "S1B", "S1C"])
Species constraint, on tips: S1A, S1B, S1C
 stem edge number 2
 crown node number 2

julia> writenewick(net) # 3 new nodes, S1 now internal: not a tip
"(S8,((((S1A,S1B,S1C)S1,S4),(S5)#H1),(#H1,S6)));"
source
PhyloNetworks.addleaf!Function
addleaf!(net::HybridNetwork, node::Node, leafname::String, edgelength::Float64=-1.0)
addleaf!(net::HybridNetwork, edge::Edge, leafname::String, edgelength::Float64=-1.0)

Add a new external edge between node or between the "middle" of edge and a newly-created leaf, of name leafname. By default, the new edge length is missing (-1).

output: newly created leaf node.

examples

julia> net = readnewick("((S1,(((S2,(S3)#H1),(#H1,S4)))#H2),(#H2,S5));");

julia> net.node[6].name # leaf S4
"S4"

julia> PhyloNetworks.addleaf!(net, net.node[6], "4a"); # adding leaf to a node

julia> writenewick(net, internallabel=true)
"((S1,(((S2,(S3)#H1),(#H1,(4a)S4)))#H2),(#H2,S5));"

julia> PhyloNetworks.addleaf!(net, net.node[6], "4b");

julia> writenewick(net, internallabel=true)
"((S1,(((S2,(S3)#H1),(#H1,(4a,4b)S4)))#H2),(#H2,S5));"
julia> net = readnewick("((S1,(((S2,(S3)#H1),(#H1,S4)))#H2),(#H2,S5));");

julia> [n.name for n in net.edge[7].node] # external edge to S4
2-element Vector{String}:
 "S4"
 ""  

julia> PhyloNetworks.addleaf!(net, net.edge[7], "4a"); # adding leaf to an edge

julia> writenewick(net, internallabel=true)
"((S1,(((S2,(S3)#H1),(#H1,(S4,4a))))#H2),(#H2,S5));"
source
PhyloNetworks.allowrootbelow!Method
allowrootbelow!(net::HybridNetwork)

Set containroot to true for each edge below the root node, then traverses net in preorder to update containroot of all edges (stopping at hybrid nodes): see the other methods. Assumes correct ischild1 edge field.

source
PhyloNetworks.allowrootbelow!Method
allowrootbelow!(e::Edge)
allowrootbelow!(n::Node, parent_edge_of_n::Edge)

Set containroot to true for edge e and all edges below, recursively. The traversal stops whenever a hybrid node is encountered: if the child of e is a hybrid node (that is, if e is a hybrid edge) or if n is a hybrid node, then the edges below e or n are not traversed.

source
PhyloNetworks.assignhybridnames!Method
assignhybridnames!(net)

Assign names to hybrid nodes in the network net. Hybrid nodes with an empty name field ("") are modified with a name that does not conflict with other hybrid names in the network. The preferred name is "H3" if the node number is 3 or -3, but an index other than 3 would be used if "H3" were the name of another node already.

If two hybrid nodes have non-empty and equal names, the name of one of them is changed and re-assigned as described above (with a warning).

source
PhyloNetworks.biconnectedcomponent_entrynodesFunction
biconnectedcomponent_entrynodes(net, bcc, preorder=true)

Array containing the entry node of the each biconnected component in bcc. bcc is supposed to contain the biconnected components as output by biconnectedcomponents, that is, an array of array of edges.

These entry nodes depend on the rooting (whereas the BCC only depend on the unrooted graph). They are either the root of the network or cut node (articulation points).

source
PhyloNetworks.biconnectedcomponent_exitnodesFunction
biconnectedcomponent_exitnodes(net, bcc, preorder=true)

Array containing an array of the exit node(s) of the each biconnected component in bcc. bcc is supposed to contain the biconnected components as output by biconnectedcomponents, that is, an array of array of edges.

These exit nodes depend on the rooting (whereas the BCC only depend on the unrooted graph). The degree of a blob is the number of exit nodes + 1 if the blob doesn't contain the root (its entry node is a cut node), or + 0 if the blob contains the root (which enters into the blob but isn't a cut node).

Warning (or positive side effect?): the edge .inte1 attribute is modified. It stores the index (in bcc) of the biconnected component that an edge belongs to. If an edge doesn't belong in any (e.g. if trivial blobs are ignored), then its .inte1 is set to -1.

source
PhyloNetworks.blobinfoFunction
blobinfo(network, ignoreTrivial=true)

Calculate the biconnected components (blobs) using function biconnectedcomponents then:

  • set node field booln4 to true at the root of each non-trivial blob (and at the network root), false otherwise. (a better name for the field would be something like "isBlobRoot".)
  • output:
    1. array of nodes that are the roots of each non-trivial blob, and the network root. If the root of the full network is not part of a non-trivial blob, a corresponding blob is added to the list.
    2. array of arrays: for each non-trivial blob, array of major hybrid edges in that blob.
    3. array of arrays: same as #2 but for minor hybrid edges, with hybrids listed in the same order, for each blob.

Blobs are ordered in reverse topological ordering (aka post order). If ignoreTrivial is true, trivial components are ignored.

keyword argument: checkpreorder, true by default. If false, the ischild1 edge field and the net.vec_node network field are supposed to be correct.

warning: see biconnectedcomponents for node attributes modified during the algorithm.

source
PhyloNetworks.breakedge!Method
breakedge!(edge::Edge, net::HybridNetwork)

Break an edge into 2 edges, each of length half that of original edge, creating a new node of degree 2. Useful to root network along an edge. Return the new node and the new edge, which is the "top" half of the original starting edge. These new node & edge are pushed last in net.node and net.edge.

If the starting edge was:

n1  --edge-->  n2

then we get this:

n1  --newedge-->  newnode  --edge-->  n2

ischild1 and containroot are updated, but not internal fields (e.g. not those used by SNaQ for level-1 networks).

examples

julia> net = readnewick("(((S8,S9),((((S1,S4),(S5)#H1),(#H1,(S6,S7))))#H2),(#H2,S10));");

julia> length(net.node)
19

julia> net.edge[4] # edge 4 goes from node -8 to 3
PhyloNetworks.EdgeT{PhyloNetworks.Node}:
 number:4
 length:-1.0
 attached to 2 node(s) (parent first): -8 3


julia> newnode, newedge = PhyloNetworks.breakedge!(net.edge[4], net);

julia> length(net.node) # one more than before
20

julia> newedge # new edge 21 goes from node -8 and 11 (new)
PhyloNetworks.EdgeT{PhyloNetworks.Node}:
 number:21
 length:-1.0
 attached to 2 node(s) (parent first): -8 11


julia> net.edge[4] # original edge 4 now goes from node 11 (new) to 3
PhyloNetworks.EdgeT{PhyloNetworks.Node}:
 number:4
 length:-1.0
 attached to 2 node(s) (parent first): 11 3


julia> writenewick(net) # note extra pair of parentheses around S1
"(((S8,S9),((((S4,(S1)),(S5)#H1),(#H1,(S6,S7))))#H2),(#H2,S10));"

See also: fuseedgesat!

source
PhyloNetworks.checkNumHybEdges!Method
checkNumHybEdges!(net)

Check for consistency between hybrid-related attributes in the network:

  • for each hybrid node: 2 or more hybrid edges
  • exception: allows for a leaf to be attached to a single hybrid edge
  • exactly 2 incoming parent hybrid edges

Run after storeHybrids!.

source
PhyloNetworks.check_nonmissing_nonnegative_edgelengthsFunction
check_nonmissing_nonnegative_edgelengths(net, str="")

Throw an Exception if net has undefined edge lengths (coded as -1.0) or negative edge lengths. The error message indicates the number of the offending edge(s), followed by str.

source
PhyloNetworks.checkspeciesnetwork!Method
checkspeciesnetwork!(network::HybridNetwork,
                     constraints::Vector{TopologyConstraint})

Check that the network satisfies a number of requirements:

  • no polytomies, other than at species constraints: throws an error otherwise
  • no unnecessary nodes: fuse edges at nodes with degree two, including at the root (hence the bang: net may be modified)
  • topology constraints are met.

Output: true if all is good, false if one or more clades are violated.

source
PhyloNetworks.cleantaxonnameMethod
cleantaxonname(taxonname::AbstractString)

Return a String with leading and trailing spaces removed, and interior spaces replaced by underscores: good for using as tip names in a network without causing future error when reading the newick description of the network.

source
PhyloNetworks.constraintviolatedMethod
constraintviolated(network::HybridNetwork,
                   constraints::Vector{TopologyConstraint})

True if network violates one (or more) of the constraints of type 1 (individuals in a species group) or type 2 (must be clades in the major tree). Warning: constraints of type 3 are not implemented. See TopologyConstraint for constraint types.

source
PhyloNetworks.deleteEdge!Method
deleteEdge!(net::HybridNetwork,  e::Edge; part=true)

Delete edge e from net.edge and update net.numedges. If part is true, update the network's partition field.

Warning: if part is true (the default), then net is assumed to be of level-1, with valid internal fields e.inte1 (to track which cycle e may be in) and valid net.partition, which then gets updated.

source
PhyloNetworks.deleteNode!Method
deleteNode!(net::HybridNetwork, n::Node)

Delete node n from a network, i.e. removes it from net.node, and from net.hybrid or net.leaf as appropriate. Update attributes numnodes, numtaxa, numhybrids.

Warning: if the root is deleted, the new root is arbitrarily set to the first node in the list. This is intentional to save time because this function is used frequently in snaq!, which handles semi-directed (unrooted) networks.

source
PhyloNetworks.deletehybridedge!Function
deletehybridedge!(net::HybridNetwork, edge::Edge,
                  nofuse=false, unroot=false,
                  multgammas=false, simplify=true, keeporiginalroot=false)

Delete a hybrid edge from net and return the network. The network does not have to be of level 1 and may contain polytomies, although each hybrid node must have exactly 2 parents. Branch lengths are updated, allowing for missing values.

If nofuse is false, when edge is removed, its child (hybrid) node is removed and its partner hybrid edge is removed. Its child edge is retained (below the hybrid node), fused with the former partner, with new length: old length + length of edge's old partner. Any 2-cycle is simplified into a single edge, unless simplify is false.

If nofuse is true, edges with descendant leaves are kept as is, and are not fused. Nodes are retained during edge removal, provided that they have at least one descendant leaf. The hybrid edge that is partner to edge becomes a tree edge, but has its γ value unchanged (it is not set to 1), since it is not merged with its child edge after removal of the reticulation. Also, 2-cycles are not simplified if nofuse is true. That is, if we get 2 hybrid edges both from the same parent to the same child, these hybrid edges are retained without being fused into a single tree edge.

If unroot is false and if the root is up for deletion during the process, it will be kept if it's of degree 2 or more. A root node of degree 1 will be deleted unless keeporiginalroot is true.

If multgammas is true: inheritance weights are kept by multiplying together the inheritance γ's of edges that are merged. For example, if there is a hybrid ladder, the partner hybrid edge remains a hybrid edge (with a new partner), and its γ is the product of the two hybrid edges that have been fused. So it won't add up to 1 with its new partner's γ.

If keeporiginalroot is true, a root of degree one will not be deleted.

Warnings:

  • containroot is updated, but this requires correct ischild1 fields
  • if the parent of edge is the root and if nofuse is false, the root is moved to keep the network unrooted with a root of degree two.
  • does not update attributes needed for snaq! (like inte1, edge.z, edge.y etc.)
source
PhyloNetworks.descendantsFunction
descendants(edge::Edge, internal::Bool=false)

Return the node numbers of the descendants of a given edge: all descendant nodes if internal is true (internal nodes and tips), or descendant tips only otherwise (defaults).

edge should belong in a rooted network for which ischild1 is up-to-date. Run directedges! beforehand. This is very important, otherwise one might enter an infinite loop, and the function does not test for this.

Examples

julia> net5 = "(A,((B,#H1),(((C,(E)#H2),(#H2,F)),(D)#H1)));" |> readnewick |> directedges! ;

julia> PhyloNetworks.descendants(net5.edge[12], true) # descendants of 12th edge: all of them
7-element Vector{Int64}:
 -6
 -7
  4
  6
  5
 -9
  7

julia> PhyloNetworks.descendants(net5.edge[12]) # descendant leaves only
3-element Vector{Int64}:
 4
 5
 7
source
PhyloNetworks.directionalconflictMethod
directionalconflict(parent::Node, edge::Edge, hybridpartnernew::Bool)

Check if creating a hybrid edge down of parent node into the middle of edge would create a directed cycle in net, i.e. not a DAG. The proposed hybrid would go in the direction of edge down its child node if hybridpartnernew is true. Otherwise, both halves of edge would have their direction reversed, for the hybrid to go towards the original parent node of edge. Does not modify the network.

Output: true if a conflict would arise (non-DAG), false if no conflict.

source
PhyloNetworks.displayednetworks!Function
displayednetworks!(net::HybridNetwork, node::Node, keepNode=false,
                   unroot=false, multgammas=false, keeporiginalroot=false)

Extracts the two networks that simplify a given network at a given hybrid node: deleting either one or the other parent hybrid edge. If nofuse is true, the original edges (and nodes) are kept in both networks, provided that they have one or more descendant leaves. If unroot is true, the root will be deleted if it becomes of degree 2. If keeporiginalroot is true, the root is retained even if it is of degree 1.

  • the original network is modified: the minor edge removed.
  • returns one HybridNetwork object: the network with the major edge removed
source
PhyloNetworks.edgerelationMethod
edgerelation(e::Edge, node::Node, origin::Edge)

Return a symbol:

  • :origin if e is equal to origin, and otherwise:
  • :parent if e is a parent of node,
  • :child if e is a child of node

using the ischild1 attribute of edges. Useful when e iterates over all edges adjacent to node and when origin is one of the edges adjacent to node, to known the order in which these edges come.

example:

labs = [edgerelation(e, u, uv) for e in u.edge] # assuming u is a node of edge uv
parentindex = findfirst(isequal(:parent), labs) # could be 'nothing' if no parent
childindices = findall( isequal(:child), labs)  # vector. could be empty
source
PhyloNetworks.entrynode_preindexMethod

entrynodepreindex(biconnectedcomponent::Partition)

Preorder index of the entry node of a biconnected component, considering the network rooted. It is an articulation node, except if it is the network's root.

source
PhyloNetworks.fliphybrid!Function
fliphybrid!(
    net::HybridNetwork,
    hybridnode::Node,
    minor::Bool=true,
    nohybridladder::Bool=false,
    constraints::Vector{TopologyConstraint}=TopologyConstraint[]
)

Flip the direction of a single hybrid edge: the minor parent edge of hybridnode by default, or the major parent edge if minor is false. The parent node of the hybrid edge becomes the new hybrid node. The former hybrid edge partner is converted to a tree edge (with γ=1), and hybridnode becomes a tree node.

For the flip to be admissible, the new network must be a semi-directed phylogenetic network: with a root such that the rooted version is a DAG. If nohybridladder is false (default), the flip may create a hybrid ladder If nohybridladder is true and if the flip would create a hybrid ladder, then the flip is not admissible. A hybrid ladder is when a hybrid child of another hybrid.

The new hybrid partner is an edge adjacent to the new hybrid node, such that the flip is admissible (so it must be a tree edge). The flipped edge retains its original γ. The new hybrid edge is assigned inheritance 1-γ.

Output: (newhybridnode, flippededge, oldchildedge) if the flip is admissible, nothing otherwise.

The network is unchanged if the flip is not admissible. If the flip is admissible, the root position may be modified, and the direction of tree edges (via ischild1) is modified accordingly. If the root needs to be modified, then the new root is set to the old hybrid node.

The index of the new hybrid node in net.hybrid is equal to that of the old hybridnode.

Warning: Undoing this move may not recover the original root if the root position was modified.

source
PhyloNetworks.fliphybrid!Method
fliphybrid!(
    [rng::AbstractRNG,]
    net::HybridNetwork,
    minor::Bool,
    nohybridladder::Bool=false,
    constraints::Vector{TopologyConstraint}=TopologyConstraint[]
)

Cycle through hybrid nodes in random order until an admissible flip is found. At this hybrid node, flip the indicated hybrid parent edge (minor or major).

If an admissible flip is found, return the tuple: newhybridnode, flippededge, oldchildedge. Otherwise, return nothing.

The flip can be undone with fliphybrid!(net, newhybridnode, minor, constraints), or fliphybrid!(net, newhybridnode, !flippededge.ismajor, constraints) more generally, such as if the flipped edge had its γ modified after the original flip.

Warnings

  • if the root needed to be reset and if the original root was of degree 2, then a node of degree 2 remains in the modified network.
  • undoing the flip may not recover the original root in case the root position was modified during the original flip.
source
PhyloNetworks.fuseedgesat!Function
fuseedgesat!(i::Integer,net::HybridNetwork, multgammas::Bool=false)

Removes ith node in net.node, if it is of degree 2. The parent and child edges of this node are fused. If either of the edges is hybrid, the hybrid edge is retained. Otherwise, the edge with the lower edge number is retained.

Reverts the action of breakedge!.

returns the fused edge.

source
PhyloNetworks.getTipSubmatrixMethod
getTipSubmatrix(M, net; indexation=:both)

Extract submatrix of M, with rows and/or columns corresponding to tips in the network, ordered like in net.leaf. In M, rows and/or columns are assumed ordered as in net.vec_node.

indexation: one of :rows, :cols or :both: are nodes numbers indexed in the matrix by rows, by columns, or both? Subsetting is taken accordingly.

source
PhyloNetworks.getconnectingedgeMethod
getconnectingedge(node1::Node, node2::Node)

Edge shared by (or connecting) node1 and node2, that is: edge incident to both nodes. An error is thrown if the 2 nodes are not connected.

See also isconnected

source
PhyloNetworks.hardwiredclusterdistance_unrootedMethod
hardwiredclusterdistance_unrooted(net1::HybridNetwork, net2::HybridNetwork)

Miminum hardwired cluster dissimilarity between the two networks, considered as unrooted (or semi-directed). This dissimilarity is defined as the minimum rooted distance, over all root positions that are compatible with the direction of hybrid edges. Called by hardwiredclusterdistance.

To avoid repeating identical clusters, all degree-2 nodes are deleted before starting the comparison. Since rooting the network at a leaf creates a root node of degree 2 and an extra cluster, leaves are excluded from possible rooting positions.

source
PhyloNetworks.hybrid3cycleMethod
hybrid3cycle(edge1::Edge, edge2::Edge)

Check if proposed hybrid edge from edge1 into edge2 would create a 3 cycle, that is, if edge1 and edge2 have a node in common. (This move cannot create a 2-cycles because new nodes would be created in the middle of edges 1 and 2.)

source
PhyloNetworks.hybridEdgesMethod
hybridEdges(node::Node, e::Edge)

Return the 2 edges connected to node other than e, in the same order as node.edge, except that e absent from the list.

Despite what the name suggest, node need not be a hybrid node! node is assumed to have 3 edges, though.

source
PhyloNetworks.hybridEdgesMethod
hybridEdges(node::Node)

Return the 3 edges attached to node in a specific order [e1,e2,e3]. Warning: assume a level-1 network with up-to-date node fields booln1 (tracking whether the node is incident to a hybrid edge and edge) and field intn1 (tracking the number given to the cycle in which the node might be).

If node is a hybrid node:

  • e1 is the major hybrid parent edge of node
  • e2 is the minor hybrid parent edge
  • e3 is the tree edge, child of node.

If node is a tree node parent of one child hybrid edge:

  • e1 is the hybrid edge, child of node
  • e2 is the tree edge that belongs to the cycle created by e1
  • e3 is the other tree edge attached to node (not in a cycle)

Otherwise:

  • e3 is an external edge from node to a leaf, if one exists.
source
PhyloNetworks.inheritanceweightMethod
inheritanceweight(tree::HybridNetwork)

Return the log inheritance weight of a network or tree (as provided by displayedtrees with nofuse = true for instance). For a tree displayed in a network, its inheritance weight is the log of the product of γ's of all edges retained in the tree. To avoid underflow, the log is calculated: i.e. sum of log(γ) across retained edges.

If any edge has a negative γ, it is assumed to mean that its γ is missing, and the function returns missing.

Example

julia> net = readnewick("(((A,(B)#H1:::0.9),(C,#H1:::0.1)),D);");

julia> trees = displayedtrees(net,0.0; nofuse=true);

julia> PhyloNetworks.inheritanceweight.(trees)
2-element Vector{Float64}:
 -0.105361
 -2.30259 
source
PhyloNetworks.initializeweightsfromleaves!Method
initializeweightsfromleaves!(w, net, tips, stateset, criterion)

Modify weight in w: to Inf for w[n, i] if the "tips" data has a state different from the lineage state of index i at node number n. Assumes that w was initialized to 0 for the leaves.

criterion: should be one of :softwired, :parental or :hardwired.

  • softwired parsimony: lineage states are in this order: ∅,{1},{2},{3},...,{nstates}
source
PhyloNetworks.isdescendant_undirectedMethod
isdescendant_undirected(des:Node, ancestor::Node, parentedge)

Return true if des is a strict descendant of ancestor when starting from edge parentedge and going towards ancestor onward, regardless of the field ischild1 of tree edges; false otherwise.

This is useful to know how descendant relationships would change as a result of reverting the direction of a tree edge, without actually modifying the direction (ischild1) of any edge.

parentedge should be connected to ancestor (not checked). The direction of hybrid edges is respected (via ischild1), that is, the traversal does not go from the child to the parent of a hybrid edge.

source
PhyloNetworks.ladderpartitionMethod
ladderpartition(tree::HybridNetwork)

For each node in tree, calculate the clade below each child edge of the node, and each clade moving up the "ladder" from the node to the root. The output is a tuple of 2 vectors (node) of vector (clade) of vectors (taxon in clade): below,above. More specifically, for node number n, below[n] is generally of vector of 2 clades: one for the left child and one for the right child of the node (unless the node is of degree 2 or is a polytomy). above[n] contains the grade of clades above node number n.

Warning: assumes that

  • the network is a tree (this is checked)
  • node numbers and edge numbers can be used as indices, that is, be all distinct, positive, covering exactly 1:#nodes and 1:#edges (this is checked for nodes).
  • edges are corrected directed (ischild1 is up-to-date) and nodes have been pre-ordered already (field vec_node up-to-date).

examples

julia> tree = readnewick("(O,A,((B1,B2),(E,(C,D))));");

julia> PhyloNetworks.resetnodenumbers!(tree; checkpreorder=true, type=:postorder)

julia> printnodes(tree)
node leaf  hybrid name i_cycle edges'numbers
1    true  false  O    -1      1   
2    true  false  A    -1      2   
3    true  false  B1   -1      3   
4    true  false  B2   -1      4   
8    false false       -1      3    4    5   
5    true  false  E    -1      6   
6    true  false  C    -1      7   
7    true  false  D    -1      8   
9    false false       -1      7    8    9   
10   false false       -1      6    9    10  
11   false false       -1      5    10   11  
12   false false       -1      1    2    11  

julia> below, above = PhyloNetworks.ladderpartition(tree);

julia> below
12-element Vector{Vector{Vector{Int64}}}:
 [[1]]                      
 [[2]]                      
 [[3]]                      
 [[4]]                      
 [[5]]                      
 [[6]]                      
 [[7]]                      
 [[3], [4]]                 
 [[6], [7]]                 
 [[5], [6, 7]]              
 [[3, 4], [5, 6, 7]]        
 [[1], [2], [3, 4, 5, 6, 7]]

julia> for n in 8:12
         println("clades below node ", n, ": ", join(below[n], " "))
       end
clades below node 8: [3] [4]
clades below node 9: [6] [7]
clades below node 10: [5] [6, 7]
clades below node 11: [3, 4] [5, 6, 7]
clades below node 12: [1] [2] [3, 4, 5, 6, 7]

julia> above[8:12] # clades sister to and above nodes 8 through 12:
5-element Vector{Vector{Vector{Int64}}}:
 [[5, 6, 7], [1], [2]]
 [[5], [3, 4], [1], [2]]
 [[3, 4], [1], [2]]     
 [[1], [2]]             
 []                     
source
PhyloNetworks.leaststableancestorFunction
leaststableancestor(net, preorder=true, preprocess=true)

Tuple (lsa, lsa_index) where lsa is the least stable ancestor (LSA) node in net, and lsa_index is the index of lsa in net.vec_node. The LSA the lowest node n with the following property: any path between any leaf and the root must go through n. All such nodes with this property are ancestral to the LSA (and therefore must have an index that is lower or equal to lsa_index).

Exception: if the network has a single leaf, the output lsa is the leaf's parent node, to maintain one external edge between the root and the leaf.

Arguments:

  • preorder to direct edges according to the root and store a pre-ordering of nodes in net.vec_node, if not done earlier on the network (e.g. after a topology modification)
  • preprocess to recalculate & store the biconnected components with process_biconnectedcomponents!, assuming the preordering was done

Warning: uses biconnectedcomponents and biconnectedcomponent_exitnodes, therefore share the same caveats regarding the use of field .intn1 and .intn2 for nodes; boole2 and .inte1 for edges. As a positivie side effect, the biconnected components can be recovered via the edges' .inte1 field –including the trivial blobs (cut edges).

See also: deleteaboveLSA!

source
PhyloNetworks.majoredgelengthMethod
majoredgelength(net::HybridNetwork)

Generate vector of edge lengths of major net edges organized in the same order as the edge matrix created via majoredgematrix. Considers values of -1.0 as missing values, recognized as NA in R. Output: vector allowing for missing values.

Assume vec_node was updated, to list nodes in pre-order.

Examples

julia> net = readnewick("(((A:3.1,(B:0.2)#H1:0.3::0.9),(C,#H1:0.3::0.1):1.1),D:0.7);");

julia> directedges!(net); preorder!(net);

julia> PhyloNetworks.majoredgelength(net)
8-element Vector{Union{Missing, Float64}}:
  missing
 0.7     
  missing
 1.1     
  missing
 3.1     
 0.3     
 0.2     
source
PhyloNetworks.majoredgematrixMethod
majoredgematrix(net::HybridNetwork)

Matrix of major edges from net where edge[i,1] is the number of the parent node of edge i and edge[i,2] is the number of the child node of edge i. Assume vec_node was updated, to list nodes in pre-order.

Examples

julia> net = readnewick("(A,(B,(C,D)));");

julia> PhyloNetworks.resetnodenumbers!(net);

julia> PhyloNetworks.majoredgematrix(net)
6×2 Matrix{Int64}:
 5  1
 5  6
 6  2
 6  7
 7  3
 7  4
source
PhyloNetworks.makemissing!Method
makemissing!(x::AbstractVector)

Turn to missing any element of x exactly equal to -1.0. Used for branch lengths and γs. x needs to accept missing values. If not, this can be done with allowmissing(x).

source
PhyloNetworks.mapindividualsMethod
mapindividuals(net::HybridNetwork, mappingFile::String)

Return a network expanded from net, where species listed in the mapping file are replaced by individuals mapped to that species. If a species has only 1 individual, the name of the leaf for that species is replaced by the name of its one individual representative. If a species has 2 or more individuals, the leaf for that species is expanded into a "star" (polytomy if 3 or more individuals) with a tip for each individual. If a species is in the network but not listed in the mapping file, the tip for that species is left as is. Species listed in the mapping file but not present in the network are ignored.

The mapping file should be readable by CSV.File and contain two columns: one for the species names and one for the individual (or allele) names. fixit: make this function more flexible by accepting column names

Output: individual-level network and vector of species constraint(s).

examples

julia> species_net = readnewick("(((S8,S9),((((S1,S4),(S5)#H1),(#H1,(S6,S7))))#H2),(#H2,S10));");

julia> filename = joinpath(dirname(Base.find_package("PhyloNetworks")), "..", "examples", "mappingIndividuals.csv");

julia> filename |> read |> String |> print # to see what the mapping file contains
species,individual
S1,S1A
S1,S1B
S1,S1C

julia> individual_net, species_constraints = PhyloNetworks.mapindividuals(species_net, filename);

julia> writenewick(individual_net, internallabel=true)
"(((S8,S9),(((((S1A,S1B,S1C)S1,S4),(S5)#H1),(#H1,(S6,S7))))#H2),(#H2,S10));"

julia> species_constraints
1-element Vector{PhyloNetworks.TopologyConstraint}:
 Species constraint, on tips: S1A, S1B, S1C
 stem edge number 4
 crown node number 3
source
PhyloNetworks.maxParsimonyNetMethod
maxParsimonyNet(T::HybridNetwork, df::DataFrame)
feature to be re-implemented

This function has been disabled. It will be re-implemented, without the level-1 restriction. Please use version 0.16 of PhyloNetworks to access this older functionality, until a better one is made available.

Search for the most parsimonious network (or tree). A level-1 network is assumed. df should be a data frame containing the species names in column 1, or in a column named species or taxon. Trait data are assumed to be in all other columns. The search starts from topology T, which can be a tree or a network with no more than hmax hybrid nodes (see optional arguments below for hmax).

Output:

  • estimated network in file .out (also in .log): best network overall and list of networks from each individual run.
  • if any error occurred, file .err provides information (seed) to reproduce the error.

Optional arguments include

  • hmax: maximum number of hybridizations allowed (default 1)
  • runs: number of starting points for the search (default 10); each starting point is T with probability probST=0.3 or a modification of T otherwise (using a NNI move, or a hybrid edge direction change)
  • Nfail: number of failures (proposed networks with equal or worse score) before the search is aborted. 75 by default: this is quite small, which is okay for a first trial. Larger values are recommended.
  • outgroup: outgroup taxon. It can be a taxon name (String) or Node number (Integer). If none provided, or if the outgroup conflicts the starting topology, the function returns an error
  • filename: root name for the output files. Default is "mp". If empty (""), files are not created, progress log goes to the screen only (standard out).
  • seed: seed to replicate a given search
  • criterion: parsimony score could be hardwired, softwired (default) or parental. Currently, only softwired is implemented

References

  1. Leo Van Iersel, Mark Jones, Celine Scornavacca (2017). Improved Maximum Parsimony Models for Phylogenetic Networks, Systematic Biology, (https://doi.org/10.1093/sysbio/syx094).

  2. Fischer, M., van Iersel, L., Kelk, S., Scornavacca, C. (2015). On computing the Maximum Parsimony score of a phylogenetic network. SIAM J. Discrete Math., 29(1):559-585.

source
PhyloNetworks.minorreticulationgammaMethod
minorreticulationgamma(net::HybridNetwork)

Vector of minor edge gammas (inheritance probabilities) organized in the same order as in the matrix created via minorreticulationmatrix. Considers values of -1.0 as missing values, recognized as NA in R. Output: vector allowing for missing values.

Examples

julia> net = readnewick("(((A,(B)#H1:::0.9),(C,#H1:::0.1)),D);");

julia> PhyloNetworks.minorreticulationgamma(net)
1-element Vector{Union{Float64, Missings.Missing}}:
 0.1
source
PhyloNetworks.minorreticulationlengthMethod
minorreticulationlength(net::HybridNetwork)

Vector of lengths for the minor hybrid edges, organized in the same order as in the matrix created via minorreticulationmatrix. Replace values of -1.0 with missing values recognized by R. Output: vector allowing for missing values.

Examples

julia> net = readnewick("(((A:3.1,(B:0.2)#H1:0.4::0.9),(C,#H1:0.3::0.1):1.1),D:0.7);");

julia> PhyloNetworks.minorreticulationlength(net)
1-element Vector{Union{Missing, Float64}}:
 0.3
source
PhyloNetworks.minorreticulationmatrixMethod
minorreticulationmatrix(net::HybridNetwork)

Matrix of integers, representing the minor hybrid edges in net. edge[i,1] is the number of the parent node of the ith minor hybrid edge, and edge[i,2] is the number of its child node. Node numbers may be negative, unless they were modified by resetnodenumbers!. Assumes correct ischild1 fields.

Examples

julia> net = readnewick("(((A,(B)#H1:::0.9),(C,#H1:::0.1)),D);");
julia> PhyloNetworks.minorreticulationmatrix(net)
1×2 Matrix{Int64}:
 -6  3
source
PhyloNetworks.moveroot!Method
moveroot!(
    [rng::AbstractRNG,]
    net::HybridNetwork,
    constraints::Vector{TopologyConstraint}=TopologyConstraint[]
)

Move the root to a randomly chosen non-leaf node that is different from the current root, and not within a constraint clade or species. Output: true if successul, nothing otherwise.

source
PhyloNetworks.nchoose1234Method
nchoose1234(nmax)

nmax+1 x 4 matrix containing the binomial coefficient "n choose k" in row n+1 and column k. In other words, M[i,k] gives "i-1 choose k". It is useful to store these values and look them up to rank (a large number of) 4-taxon sets: see quartetrank.

source
PhyloNetworks.nj!Function
nj!(D::Matrix{Float64}, names::AbstractVector{<:AbstractString}=String[];
    force_nonnegative_edges::Bool=false)

Construct a phylogenetic tree from the input distance matrix and vector of names (as strings), using the Neighbour-Joinging algorithm (Satou & Nei 1987). The order of the names argument should match that of the row (and column) of the matrix D. With force_nonnegative_edges being true, any negative edge length is changed to 0.0 (with a message).

Warning: D is modified.

source
PhyloNetworks.nnimaxMethod
nnimax(e::Edge)

Return the number of NNI moves around edge e, assuming that the network is semi-directed. Return 0 if e is not internal, and more generally if either node attached to e does not have 3 edges.

Output: UInt8 (e.g. 0x02)

source
PhyloNetworks.norootbelow!Method
norootbelow!(e::Edge)

Set containroot to false for edge e and all edges below, recursively. The traversal stops if e.containroot is already false, assuming that containroot is already false all the way below down that edge.

source
PhyloNetworks.number_exitnodesMethod
number_exitnodes(biconnected_component::Partition)

Number of articulation exit nodes of a biconnected component, considering the network rooted. An articulation node is such that if removed, the network is disconnected. An articulation exit node is a node in the block that is incident to a different block below.

Note that leaves are not considered articulation exit nodes because they are not articulation nodes. So external (pendent) edges are trivial blobs with 0 (non-internal) exit nodes. If the network is rooted at a leaf, there may be some pathological behaviors depending on the downstream task (e.g. this is checked for by [leaststableancestor])(@ref)).

source
PhyloNetworks.pairwisetaxondistance_gradientMethod
pairwisetaxondistance_gradient(net; checkEdgeNumber=true, nodeAges=[])

3-dim array: gradient of pairwise distances between all nodes. (internal and leaves); gradient with respect to edge lengths if nodeAges is empty; with respect to node ages otherwise. Assume correct net.vec_node (preorder). This gradient depends on the network's topology and γ's only, not on branch lengths or node ages (distances are linear in either).

WARNING: edge numbers need to range between 1 and #edges.

source
PhyloNetworks.parsenewick_edgedata!Method
parsenewick_edgedata!(s::IO, edge, numberOfLeftParentheses::Array{Int,1})

Helper function for readnewick_subtree!. Modifies e according to the specified edge length and gamma values in the tree topology. Advances the stream s past any existing edge data. Edges in a topology may optionally be followed by ":edgeLen:bootstrap:gamma" where edgeLen, bootstrap, and gamma are decimal values. Nexus-style comments [&...], if any, are ignored.

source
PhyloNetworks.parsenewick_getfloat!Method
parsenewick_getfloat!(s::IO, int, numLeft::Array{Int,1})

Helper function for parsenewick_edgedata!. Read a single floating point edge data value in a tree topology. Ignore (and skip) nexus-style comments before & after the value (see readnexus_comment).

Return -1.0 if no value exists before the next colon, return the value as a float otherwise. Modifies s by advancing past the next colon character. Only call this function to read a value when you know a numerical value exists!

source
PhyloNetworks.parsenewick_hybridnode!Method
parsenewick_hybridnode!(node, parentNode, hybridName, net, hybrids)

Helper function for readnewick_subtree!. Create the parent edge for node. Return this edge, and the hybrid node retained (node or its clone in the newick string). Insert new edge and appropriate node into net and hybrids accordingly. Handles any type of given hybrid node. Called after a # has been found in a tree topology.

source
PhyloNetworks.parsenewick_remainingsubtree!Method
parsenewick_remainingsubtree!(s::IO, numLeft, net, hybrids)

Create internal node. Helper for readnewick_subtree!, which creates the parent edge of the node created by parsenewick_remainingsubtree!: readnewick_subtree! calls parsenewick_remainingsubtree!, and vice versa. Called once a ( has been read in a tree topology and reads until the corresponding ) has been found. This function performs the recursive step for readnewick_subtree!. Advances s past the subtree, adds discovered nodes and edges to net, and hybrids.

Does not read the node name and the edge information of the subtree root: this is done by readnewick_subtree!

source
PhyloNetworks.parsenewick_treenode!Method
parsenewick_treenode!(node, parentNode, net)

Helper function for readnewick_subtree!. Insert the input tree node and associated edge (created here) into net.

source
PhyloNetworks.parsimonyGF_bottomup!Method
parsimonyGF_bottomup!(node, blobroot, nchar, w, scores,
                     costmatrix1, costmatrix2)

Compute the MP scores (one for each assignment of the blob root state) given the descendants of a blob, conditional on the states at predefined parents of hybrids in the blobs (one parent per hybrid) as described in

Leo Van Iersel, Mark Jones, Celine Scornavacca (2017). Improved Maximum Parsimony Models for Phylogenetic Networks, Systematic Biology, (https://doi.org/10.1093/sysbio/syx094).

Assumes a set of state guesses, ie correct initialization of w for predefined hybrid parents, and correct boole2 field for the children edges of these predefined parents. boole2 is true for edges that are cut.

The field booln4 is used to know which nodes are at the root of a blob. The field ischild1 is used (and assumed correct). Field intn1 is assumed to store the # of detached parents (with guessed states)

  • nchar: number of characters considered at internal lineages. For softwired parsimony, this is # states + 1, because characters at internal nodes are ∅, {1}, {2}, etc. For parental parsimony, this is 2^#states -1, because characters are all sets on {1,2,...} except for the empty set ∅.
  • costmatrix1[i,j] and costmatrix2[k][i,j]: 2d array and vector of 2d arrays containing the cost of going to character j starting from character i when the end node has a single parent, or the cost of a child node having character j when its parents have characters k and i. These cost matrices are pre-computed depending on the parsimony criterion (softwired, hardwired, parental etc.)

used by parsimonyGF.

source
PhyloNetworks.parsimonyfitchMethod
parsimonyfitch(net, tipdata)

Calculate the most parsimonious (MP) score of a network given a discrete character at the tips. The softwired parsimony concept is used: where the number of state transitions is minimized over all trees displayed in the network. Tip data can be given in a data frame, in which case the taxon names are to appear in column 1 or in a column named "taxon" or "species", and trait values are to appear in column 2 or in a column named "trait". Alternatively, tip data can be given as a dictionary taxon => trait.

also return the union of all optimized character states at each internal node as obtained by Fitch algorithm, where the union is taken over displayed trees with the MP score.

source
PhyloNetworks.parsimonyfitch_bottomup!Method
parsimonyfitch_bottomup!(node, states, score)

Bottom-up phase (from tips to root) of the Fitch algorithm: assign sets of character states to internal nodes based on character states at tips. Polytomies okay. Assumes a tree (no reticulation) and correct ischild1 attribute.

output: dictionary with state sets and most parsimonious score

source
PhyloNetworks.parsimonyfitch_topdown!Method
parsimonyfitch_topdown!(node, states)

Top-down phase (root to tips) of the Fitch algorithm: constrains character states at internal nodes based on the state of the root. Assumes a tree: no reticulation.

output: dictionary with state sets

source
PhyloNetworks.parsimonysoftwired_bottomup!Method
parsimonysoftwired_bottomup!(node, blobroot, states, w, scores)

Computing the MP scores (one for each assignment of the root state) of a swicthing as described in Algorithm 1 in the following paper:

Fischer, M., van Iersel, L., Kelk, S., Scornavacca, C. (2015). On computing the Maximum Parsimony score of a phylogenetic network. SIAM J. Discrete Math., 29(1):559-585.

Assumes a switching (ie correct boole2 field) and correct ischild1 field. The field booln4 is used to know which nodes are at the root of a blob.

source
PhyloNetworks.postorder_nodeupdate!Function
traversal_postorder(nodes, init_function,
    tip_function, internalnode_function, parameters...)
postorder_nodeupdate!(node_index, nodes, output_array,
    tip_function, internalnode_function, parameters...)

Generic tool to apply a post-order (or reverse topological ordering) algorithm, acting on a matrix where rows & columns correspond to nodes. Used by descendencematrix.

output: matrix output_array.

arguments:

  • nodes: array of nodes in the network, pre-ordered, typically the internal net.vec_node after applying preorder!(net). This array is traversed in reverse order.
  • output_array: output object, of type Matrix, named V below
  • init_function: to initialize the output array, taking (nodes, parameters...) as arguments
  • tip_function: to do whatever needs to be done to V at a leaf node, using (V, tip_index, parameters...) as arguments
  • internalnode_function: to do whatever needs to be done to V at an internal node, using arguments (V, node_index, childnodes_index_vector, childedges_vector, parameters...)

The last 3 functions should return a boolean. If true: traversal continues. If false: traversal is stopped, that is, the next node is not processed.

See also traversal_preorder, and traversalupdate_default! for a default function that does nothing to V and returns true, with an adequate signature to be used here.

source
PhyloNetworks.preorder_nodeupdate!Function
traversal_preorder(nodes, init_function, root_function, tree_node_function,
                  hybrid_node_function, parameters...)
traversal_preorder!(nodes, output_array, root_function, tree_node_function,
                   hybrid_node_function, parameters...)
preorder_nodeupdate!(node_index, nodes, output_array, root_function, tree_node_function,
               hybrid_node_function, parameters...)

Generic tool to apply a pre-order (or topological ordering) algorithm. Used by sharedpathmatrix and by pairwisetaxondistancematrix, for example. preorder_nodeupdate! is a helper that calls the root / tree node / hybrid node function as appropriate.

output: array object output_array.

arguments:

  • nodes: array of nodes in the network, pre-ordered, typically the internal net.vec_node after applying preorder!(net)
  • output_array: output object, of type AbstractArray, named V below
  • init_function: to initialize the output array, taking (nodes, parameters...) as arguments
  • root_function: to do whatever needs to be done to V at the root, using (V, rootnode_index, parameters...) as arguments
  • tree_node_function: to do whatever needs to be done to V at a tree node, using (V, treenode_index, parentnode_index, parentedge, parameters...) as arguments
  • hybrid_node_function: to do whatever needs to be done to V at a hybrid node, using arguments (V, hybnode_index, parentnodes_index_vector, parentedges_vector, parameters...)

The last 3 functions should return a boolean. If true: traversal continues. If false: traversal is stopped, that is, the next node is not processed.

See also traversal_postorder, and traversalupdate_default! for a default function that does nothing to V and returns true, with an adequate signature to be used here.

source
PhyloNetworks.problem4cycleMethod
problem4cycle(β::Node, δ::Node, α::Node, γ::Node)

Check if the focus edge uv has a 4 cycle that could lead to a 3 cycle after an chosen NNI. Return true if there is a problem 4 cycle, false if none.

source
PhyloNetworks.process_biconnectedcomponents!Function
process_biconnectedcomponents!(net, preorder=true)

Calculate biconnected components (aka "blobs" in binary networks, "blocks" more generally) and save them in net for re-use later. The field net.partition is reset to store the biconnected components, by storing them in preorder (a block comes before any of its descendents). Each one is stored as a Partition object, which stores a list of edges in the biconnected component (which partition edges).

Trivial blocks are formed by one cut-edge and its 2 adjacent nodes. They are stored.

The edge field .inte1 is used to store the biconnected component that the edge belongs to, that is, an edge belongs in net.partition[edge.inte1]. This is an internal coding that could break in the future, and that could be modified by other functions.

Warning

This preprocessing and net.partition become obsolete and incorrect if the network's topology is later modified, such as after a semidirected NNI, the deletion/addition of a hybrid edge / a leaf. Functions that modify a network are not required to update the blob decomposition (to save time for analyses that don't need the blob decomposition).

After a re-rooting with rootatnode!, the blobs remain the same with the same set of edges, but their entry & exit nodes may become incorrect. After rootonedge!, the number of blobs remains correct but one edge was subdivided into 2 edges (and the old root node might have been suppressed), so the edge list in some blobs becomes obsolete.

examples

The network below is binary, so each blocks and blobs are the same. It has 2 non-trivial blobs: one with 1 hybrid and the other with 2 hybrids. We first get the blobs, then look at the 2-hybrid blob in more detail.

julia> net = readnewick("(((D,#H2),((((E)#H2,C),#H1),((B)#H1,A))),((H,#H3),((F)#H3,G)));");

julia> # using PhyloPlots; plot(net, showedgenumber=true, shownodenumber=true);

julia> PhyloNetworks.process_biconnectedcomponents!(net);

julia> length(net.partition) # number of biconnected components
12

julia> findall(!PhyloNetworks.istrivial(blob) for blob in net.partition)
2-element Vector{Int64}:
 3
 7

julia> blob = net.partition[7]; getlevel(blob)
2

julia> [e.number for e in blob.edges]
9-element Vector{Int64}:
 14
  9
 13
 11
  8
  7
  5
  2
  3

julia> i = PhyloNetworks.entrynode_preindex(blob); net.vec_node[i] # entry to the blob
PhyloNetworks.Node:
 number:-3
 attached to 3 edges, numbered: 3 14 15

julia> PhyloNetworks.number_exitnodes(blob) # 5 exit nodes connecting to other blobs below
5

julia> [net.vec_node[i].number for i in PhyloNetworks.exitnodes_preindex(blob)]
5-element Vector{Int64}:
 -9
  5
 -7
  2
 -4
source
PhyloNetworks.quartetdata_columnnamesMethod
quartetdata_columnnames(T) where T <: StaticArray

Vector of column names to hold the quartet data of type T in a table. If T is a length-3 vector type, they are "CF1234","CF1324","CF1423". If T is a length-4 vector type, the 4th name is "ngenes". If T is a 3×n matrix type, the output vector contains 3×n names, 3 for each of "CF", "V2", "V3", ... "Vn".

Used by tablequartetCF to build a table from a vector of QuartetT objects.

source
PhyloNetworks.quartetrankMethod
quartetrank(t1,t2,t3,t4, nCk::Matrix)
quartetrank([t1,t2,t3,t4], nCk)

Return the rank of a four-taxon set with taxon numbers t1,t2,t3,t4, assuming that tis are positive integers such that t1<t2, t2<t3 and t3<t4 (assumptions not checked!). nCk should be a matrix of "n choose k" binomial coefficients: see nchoose1234.

examples

julia> nCk = PhyloNetworks.nchoose1234(5)
6×4 Matrix{Int64}:
 0   0   0  0
 1   0   0  0
 2   1   0  0
 3   3   1  0
 4   6   4  1
 5  10  10  5

julia> PhyloNetworks.quartetrank([1,2,3,4], nCk)
1

julia> PhyloNetworks.quartetrank([3,4,5,6], nCk)
15
source
PhyloNetworks.readcsvtoarrayMethod
readcsvtoarray(dat::DataFrame)
readcsvtoarray(filename::String)

Read a CSV table containing both species names and data, create two separate arrays: one for the species names, a second for the data, in a format that parsimonyGF needs.

Warning:

  • it will try to find a column 'taxon' or 'species' for the taxon names. If none found, it will assume the taxon names are in column 1.
  • will use all other columns as characters
source
PhyloNetworks.readfastatoarrayFunction
readfastatoarray(filename::AbstractString, sequencetype=BioSequences.LongDNA{4})

Read a fasta-formatted file. Return a tuple species, sequences where species is a vector of Strings with identifier names, and sequences is a vector of BioSequences, each of type sequencetype (DNA by default).

Warnings:

  • assumes a semi-sequential format, not interleaved. More specifically, each taxon name should appear only once. For this one time, the corresponding sequence may be broken across several lines though.
  • fails if all sequences aren't of the same length
source
PhyloNetworks.readnewick_nodenameMethod
readnewick_nodename(s::IO, c::Char, net, numLeft)

Auxiliary function to read a taxon name during newick parsing. output: tuple (number, name, pound_boolean)

Names may have numbers: numbers are treated as strings. Accepts # as part of the name (but excludes it from the name), in which case pound_boolean is true. # is used in extended newick to flag hybrid nodes.

Nexus-style comments following the node name, if any, are read and ignored.

source
PhyloNetworks.readnewick_subtree!Method
readnewick_subtree!(s::IO, parentNode, numLeft, net, hybrids)

Recursive helper method for readnewick: read a subtree from an extended Newick topology. input s: IOStream/IOBuffer.

Reads additional info formatted as: :length:bootstrap:gamma. Allows for name of internal nodes without # after closing parenthesis: (1,2)A. Warning if hybrid edge without γ, or if γ (ignored) without hybrid edge

source
PhyloNetworks.readnexus_commentMethod
readnexus_comment(s::IO, c::Char)

Read (and do nothing with) nexus-style comments: [& ... ] Assumption: 'c' is the next character to be read from s. Output: nothing.

Comments can appear after (or instead of) a node or leaf name, before or after an edge length, and after another comment.

source
PhyloNetworks.readnexus_extractgammaMethod
readnexus_extractgamma(nexus_string)

Extract γ from comments and return a dictionary hybrid number ID => γ, from one single phylogeny given as a string. The output from BEAST2 uses this format for reticulations at minor edges, as output by bacter (Vaughan et al. 2017):

#11[&conv=0, relSize=0.08, ...

or as output by SpeciesNetwork (Zhang et al. 2018):

#H11[&gamma=0.08]

The function below assumes that the "H" was already added back if not present already (from bacter), like this:

#H11[&conv=0, relSize=0.19, ...

The bacter format is tried first. If this format doesn't give any match, then the SpeciesNetwork format is tried next. See readnexus_assigngammas!.

source
PhyloNetworks.readnexus_translatetableMethod
readnexus_translatetable(io)

Read translate table from IO object io, whose first non-empty line should contain "translate". Then each line should have "number name" and the end of the table is indicated by a ;. Output tuple:

  • line that was last read, and is not part of the translate table, taken from io
  • translate: boolean, whether a table was successfully read
  • id2name: dictionary mapping number to name.
source
PhyloNetworks.removeHybrid!Method
removeHybrid!(net::Network, n::Node)

Delete a hybrid node n from net.hybrid, and update net.numHybrid. The actual node n is not deleted. It is kept in the full list net.node. Very internal function, used by deletehybridedge! and others.

source
PhyloNetworks.resetedgenumbers!Function
resetedgenumbers!(net::HybridNetwork, verbose=true)

Check that edge numbers of net are consecutive numbers from 1 to the total number of edges. If not, reset the edge numbers to be so.

source
PhyloNetworks.resetnodenumbers!Method
resetnodenumbers!(net::HybridNetwork; checkpreorder=true, type=:ape)

Change internal node numbers of net to consecutive numbers from 1 to the total number of nodes.

keyword arguments:

  • type: default is :ape, to get numbers that satisfy the conditions assumed by the ape R package: leaves are 1 to n, the root is n+1, and internal nodes are higher consecutive integers. If :postorder, nodes are numbered in post-order, with leaves from 1 to n (and the root last). If :internalonly, leaves are unchanged. Only internal nodes are modified, to take consecutive numbers from (max leaf number)+1 and up. With this last option, the post-ordering of nodes is by-passed.
  • checkpreorder: if false, the ischild1 edge field and the net.vec_node network field are supposed to be correct (to get nodes in preorder). This is not needed when type=:internalonly.

Examples

julia> net = readnewick("(A,(B,(C,D)));");

julia> PhyloNetworks.resetnodenumbers!(net)

julia> printnodes(net) # first column "node": root is 5
node leaf  hybrid name i_cycle edges'numbers
1    true  false  A    -1      1   
2    true  false  B    -1      2   
3    true  false  C    -1      3   
4    true  false  D    -1      4   
7    false false       -1      3    4    5   
6    false false       -1      2    5    6   
5    false false       -1      1    6   

julia> net = readnewick("(A,(B,(C,D)));");

julia> PhyloNetworks.resetnodenumbers!(net; type=:postorder)

julia> printnodes(net) # first column "node": root is 7
node leaf  hybrid name i_cycle edges'numbers
1    true  false  A    -1      1   
2    true  false  B    -1      2   
3    true  false  C    -1      3   
4    true  false  D    -1      4   
5    false false       -1      3    4    5   
6    false false       -1      2    5    6   
7    false false       -1      1    6   
source
PhyloNetworks.resolvetreepolytomy!Function
resolvetreepolytomy!(net::HybridNetwork, n::Node, e1::Edge, e2::Edge)

Create new node v and new edge n --> v, detach edges e1 and e2 from n to connect them to v instead. Return the new edge, whose child is v.

Warning: assumes that n is incident to both e1 and e2. It is best if these edges are originally children of n because the new edge is directed from n to the newly created node.

source
PhyloNetworks.resolvetreepolytomy!Method
resolvetreepolytomy!(net::HybridNetwork, n::Node)

Create new nodes and edges from a tree node n, until it has degree 3. It is best if the network's tree edges are correctly directed beforehand.

source
PhyloNetworks.samplebootstrap_multilociMethod
samplebootstrap_multiloci(vector of tree lists; seed=0, generesampling=false, row=0)
samplebootstrap_multiloci!(tree list, vector of tree lists; seed=0, generesampling=false, row=0)

Sample bootstrap gene trees, 1 tree per gene. Set the seed with keyword argument seed, which is 0 by default. When seed=0, the actual seed is set using the clock. Assumes a vector of vectors of networks (see readmultinewick_files), each one of length 1 or more (error if one vector is empty, tested in bootsnaq).

  • site resampling: always, from sampling one bootstrap tree from each given list. This tree is sampled at random unless row>0 (see below).
  • gene resampling: if generesampling=true (default is false), genes (i.e. lists) are sampled with replacement.
  • row=i: samples the ith bootstrap tree for each gene. row is turned back to 0 if gene resampling is true.

output: one vector of trees. the modifying function (!) modifies the input tree list and returns it.

source
PhyloNetworks.shrink2cycleat!Method
shrink2cycleat!(net::HybridNetwork, minor::Edge, major::Edge, unroot::Bool)

Remove minor edge then update the branch length of the remaining major edge. Called by shrink2cycles!

Assumption: minor and major do form a 2-cycle. That is, they start and end at the same node.

source
PhyloNetworks.shrink3cycleat!Method
shrink3cycleat!(net::HybridNetwork, hybrid::Node, edge1::Edge, edge2::Edge,
                node1::Node, node2::Node, unroot::Bool)

Replace a 3-cycle at a given hybrid node by a single node, if any. Assumption: edge1 (node1) and edge2 (node2) are the parent edges (nodes) of hybrid. Return true if a 3-cycle is found and removed, false otherwise. There is a 3-cycle if nodes 1 & 2 are connected, by an edge called e3 below.

There are two cases, with differing effects on the γ inheritance values and branch lengths.

Hybrid case: the 3-cycle is shrunk to a hybrid node, which occurs if either node 1 or 2 is a hybrid node (that is, e3 is hybrid). If e3 goes from node 1 to node 2, the 3-cycle (left) is shrunk as on the right:

\eA      /eB           \eA  /eB
 1--e3->2       γ1+γ2γ3 \  / γ2(1-γ3)
  \    /               hybrid
 γ1\  /γ2
  hybrid

with new branch lengths: new tA = tA + (γ1.t1 + γ2γ3.(t2+t3))/(γ1+γ2γ3), new tB = tB + t2, provided that γ1, γ2=1-γ1, and γ3 are not missing. If one of them is missing then γ1 and γ2 remain as is, and e3 is deleted naively, such that new tA = tA + t1 and new tB = tB + t2. If γ's are not missing but one of t1,t2,t3 is missing, then the γ's are updated to γ1+γ2γ3 and γ2(1-γ3), but t's are update naively.

Tree case: the 3-cycle is shrunk to a tree node, which occurs if node 1 & 2 are both tree nodes (that is, e3 is a tree edge). If eC is the child edge of hybrid, the 3-cycle (left) is shrunk as on the right:

\eA                  \eA
 1--e3--2--eB--       \
  \    /               n--eB--
 γ1\  /γ2              |
  hybrid               |eC
    |
    |eC

with new branch lengths: new tA = tA + γ2.t3, new tB = tB + γ1.t3, new tC = tC + γ1.t1 + γ2.t2, provided that γ1, γ2=1-γ1, t1, t2 and t3 are not missing. If one is missing, then e1 is deleted naively such that tB is unchanged, new tC = tC + t2 and new tA = tA + t3.

source
PhyloNetworks.shrinkedge!Method
shrinkedge!(net::HybridNetwork, edge::Edge)

Delete edge from net, provided that it is a non-external tree edge. Specifically: delete its child node (as determined by ischild1) and connect all edges formerly incident to this child node to the parent node of edge, thus creating a new polytomy, unless the child was of degree 2.

Warning: it's best for ischild1 to be in sync with the root for this. If not, the shrinking may fail (if edge is a tree edge but its "child" is a hybrid) or the root may change arbitrarily (if the child of edge is the root).

Output: true if the remaining node (parent of edge) becomes a hybrid node with more than 1 child after the shrinking; false otherwise (e.g. no polytomy was created, or the new polytomy is below a tree node)

source
PhyloNetworks.startingBL!Function
startingBL!(net::HybridNetwork,
            trait::AbstractVector{Vector{Union{Missings.Missing,Int}}},
            siteweight::AbstractVector{Float64}=ones(length(trait[1])))

Calibrate branch lengths in net by minimizing the mean squared error between the JC-adjusted pairwise distance between taxa, and network-predicted pairwise distances, using calibratefrompairwisedistances!. The network is not forced to be time-consistent nor ultrametric. To avoid one source of non-identifiability, the network is "zipped" by forcing minor hybrid edges to have length 0.

siteweight[k] gives the weight of site (or site pattern) k (default: all 1s).

Assumptions:

  • all species have the same number of traits (sites): length(trait[i]) constant
  • trait[i] is for leaf with node.number = i in net, and trait[i][j] = k means that leaf number i has state index k for trait j. These indices are those used in a substitution model (see PhyloTraits.jl): kth value of getlabels(model).

Other:

  • Hamming distances are calculated for each pair, ignoring any site in which one of 2 values is missing. The total number of differences is then divided by the total weight of all sites, ignoring that some of them may have been missing for the pair. This is an inexact rescaling of the hamming distance, assuming a small proportion of missing values for each pair.
  • Theoretically, Hamming distances are < 0.75 with four states, or < (n-1)/n for n states. If not, all pairwise hamming distances are scaled by .75/(m*1.01) where m is the maximum observed hamming distance, to make them all < 0.75. The JC correction calculates: dJC = - 0.75 log(1-dhamming/0.75) for n=4 states
  • At the end, any edge length smaller than 1.0e-10 is reset to 0.0001, to avoid the 0 boundary.
source
PhyloNetworks.startree_newickFunction
startree_newick(n, l)

String for the Newick parenthetical description of the star tree with n tips, and all branch lengths equal to l.

source
PhyloNetworks.symmetricnet_newickFunction
symmetricnet_newick(n::Int, h::Int, γ::Real)

Create a string for a symmetric network with 2^n tips, numbered from 1 to 2^n, with a symmetric major tree, whose branch lengths are all equal. 2^(n-h) hybrids are added from level h to h-1 "symmetrically". The network is time-consistent and ultrametric, with a total height of 1.

source
PhyloNetworks.symmetricnet_newickMethod
symmetricnet_newick(n::Int, i::Int, j::Int, γ::Real, l::Real)

Newick string for a network with a symmetric major tree with 2^n tips, numbered from 1 to 2^n. All the branch lengths of the major tree are set to l. One hybrid branch, going from level i to level j is added, cutting in half each initial edge in the tree. The new edge has length l and inheritance γ.

source
PhyloNetworks.symmetrictree_newickFunction
symmetrictree_newick(n::Int, ell::Real, i=1)

String for the Newick parenthetical description of a symmetric tree with 2^n tips, numbered from i to i+2^n-1. All branch lengths are set equal to ell. The tree can be created later by reading the string with readnewick.

source
PhyloNetworks.synchronizepartnersdata!Method
synchronizepartnersdata!(e::Edge, n::Node)

Synchronize γ and ismajor for edges e and its partner, both hybrid edges with the same child n:

  • if one γ is missing and the other is not: set the missing γ to 1 - the other
  • γ's should sum up to 1.0
  • update ismajor to match the γ information: the major edge is the one with γ > 0.5.

Warnings: does not check that e is a hybrid edge, nor that n is the child of e.

source
PhyloNetworks.timeinconsistency_averageMethod
timeinconsistency_average(
    candidate_nodeheights,
    missingparent_heights,
    isconsistent::Ref{Bool},
    parent_edges,
    nonmissingparent_j;
    atol::Real=1e-8, rtol::Real=√eps(Float64))

Calculate the γ-weighted average node height of a given hybrid node h, based on the candidate node heights from its parents with non-missing edge lengths.

  • If some of these parent edges have a missing γ, then equal weights are used and a warning is issued.
  • If the hybrid node's average height (calculate from non-missing lengths) turns out to be lower than one of the parent's height with a missing length (such that this parent edge would need to be assigned a negative value) then a warning is issued.

Outcome:

  • update isconsistent to false if the candidate node heights are not all equal or if some missing edge length would have to be assigned a negative value to make the network time-consistent
  • returns (true, nodeheight)

Assumption: candidate_nodeheight is not empty, that is, the node has at least one parent edge with a non-missing length.

See also _getnodeheights and getnodeheights_average

source
PhyloNetworks.timeinconsistency_checkFunction
timeinconsistency_error(
    candidate_nodeheights,
    missingparent_heights,
    args...;
    atol::Real=1e-8, rtol::Real=√eps(Float64))
timeinconsistency_check

Check that all candidate node heights are approximately equal to one another, and that this shared value nodeheight is higher (farther from the root) than the height of parents connected by edge of missing length missingparent_heights to ensure that these edge lengths would be assigned non-negative values.

If any of these conditions is not met, timeinconsistency_error throws an error. Otherwise, it returns (true, nodeheight) where true means that the network is (or could be) time-consistent at the node being considered. timeinconsistency_check returns (is_timeconsistent, nodeheight) but does not throw an error the is_timeconsistent if false (for either reason).

Assumption: candidate_nodeheight is not empty, that is, the node has at least one parent edge with a non-missing length.

See also _getnodeheights

source
PhyloNetworks.timeinconsistency_errorMethod
timeinconsistency_error(
    candidate_nodeheights,
    missingparent_heights,
    args...;
    atol::Real=1e-8, rtol::Real=√eps(Float64))
timeinconsistency_check

Check that all candidate node heights are approximately equal to one another, and that this shared value nodeheight is higher (farther from the root) than the height of parents connected by edge of missing length missingparent_heights to ensure that these edge lengths would be assigned non-negative values.

If any of these conditions is not met, timeinconsistency_error throws an error. Otherwise, it returns (true, nodeheight) where true means that the network is (or could be) time-consistent at the node being considered. timeinconsistency_check returns (is_timeconsistent, nodeheight) but does not throw an error the is_timeconsistent if false (for either reason).

Assumption: candidate_nodeheight is not empty, that is, the node has at least one parent edge with a non-missing length.

See also _getnodeheights

source
PhyloNetworks.timeinconsistency_majortreeMethod
timeinconsistency_majortree(
    candidate_nodeheights,
    missingparent_heights,
    isconsistent::Ref{Bool},
    parent_edges,
    nonmissingparent_j;
    atol::Real=1e-8, rtol::Real=√eps(Float64))

Calculate node height of a given hybrid node h, based on the its major parent node height, if its major parent edge has a non-missing length. If missing, then the non-missing edge length with the largest γ is used and a warning is issued. If all parent edges have missing γ values then an error is thrown.

Outcome:

  • update isconsistent to false if the candidate node heights are not all equal or if some missing edge length would have to be assigned a negative value to make the network time-consistent
  • returns (true, nodeheight)

Assumption: candidate_nodeheight is not empty, that is, the node has at least one parent edge with a non-missing length.

See also _getnodeheights and getnodeheights_average

source
PhyloNetworks.traversal_postorderMethod
traversal_postorder(nodes, init_function,
    tip_function, internalnode_function, parameters...)
postorder_nodeupdate!(node_index, nodes, output_array,
    tip_function, internalnode_function, parameters...)

Generic tool to apply a post-order (or reverse topological ordering) algorithm, acting on a matrix where rows & columns correspond to nodes. Used by descendencematrix.

output: matrix output_array.

arguments:

  • nodes: array of nodes in the network, pre-ordered, typically the internal net.vec_node after applying preorder!(net). This array is traversed in reverse order.
  • output_array: output object, of type Matrix, named V below
  • init_function: to initialize the output array, taking (nodes, parameters...) as arguments
  • tip_function: to do whatever needs to be done to V at a leaf node, using (V, tip_index, parameters...) as arguments
  • internalnode_function: to do whatever needs to be done to V at an internal node, using arguments (V, node_index, childnodes_index_vector, childedges_vector, parameters...)

The last 3 functions should return a boolean. If true: traversal continues. If false: traversal is stopped, that is, the next node is not processed.

See also traversal_preorder, and traversalupdate_default! for a default function that does nothing to V and returns true, with an adequate signature to be used here.

source
PhyloNetworks.traversal_preorder!Function
traversal_preorder(nodes, init_function, root_function, tree_node_function,
                  hybrid_node_function, parameters...)
traversal_preorder!(nodes, output_array, root_function, tree_node_function,
                   hybrid_node_function, parameters...)
preorder_nodeupdate!(node_index, nodes, output_array, root_function, tree_node_function,
               hybrid_node_function, parameters...)

Generic tool to apply a pre-order (or topological ordering) algorithm. Used by sharedpathmatrix and by pairwisetaxondistancematrix, for example. preorder_nodeupdate! is a helper that calls the root / tree node / hybrid node function as appropriate.

output: array object output_array.

arguments:

  • nodes: array of nodes in the network, pre-ordered, typically the internal net.vec_node after applying preorder!(net)
  • output_array: output object, of type AbstractArray, named V below
  • init_function: to initialize the output array, taking (nodes, parameters...) as arguments
  • root_function: to do whatever needs to be done to V at the root, using (V, rootnode_index, parameters...) as arguments
  • tree_node_function: to do whatever needs to be done to V at a tree node, using (V, treenode_index, parentnode_index, parentedge, parameters...) as arguments
  • hybrid_node_function: to do whatever needs to be done to V at a hybrid node, using arguments (V, hybnode_index, parentnodes_index_vector, parentedges_vector, parameters...)

The last 3 functions should return a boolean. If true: traversal continues. If false: traversal is stopped, that is, the next node is not processed.

See also traversal_postorder, and traversalupdate_default! for a default function that does nothing to V and returns true, with an adequate signature to be used here.

source
PhyloNetworks.traversal_preorderMethod
traversal_preorder(nodes, init_function, root_function, tree_node_function,
                  hybrid_node_function, parameters...)
traversal_preorder!(nodes, output_array, root_function, tree_node_function,
                   hybrid_node_function, parameters...)
preorder_nodeupdate!(node_index, nodes, output_array, root_function, tree_node_function,
               hybrid_node_function, parameters...)

Generic tool to apply a pre-order (or topological ordering) algorithm. Used by sharedpathmatrix and by pairwisetaxondistancematrix, for example. preorder_nodeupdate! is a helper that calls the root / tree node / hybrid node function as appropriate.

output: array object output_array.

arguments:

  • nodes: array of nodes in the network, pre-ordered, typically the internal net.vec_node after applying preorder!(net)
  • output_array: output object, of type AbstractArray, named V below
  • init_function: to initialize the output array, taking (nodes, parameters...) as arguments
  • root_function: to do whatever needs to be done to V at the root, using (V, rootnode_index, parameters...) as arguments
  • tree_node_function: to do whatever needs to be done to V at a tree node, using (V, treenode_index, parentnode_index, parentedge, parameters...) as arguments
  • hybrid_node_function: to do whatever needs to be done to V at a hybrid node, using arguments (V, hybnode_index, parentnodes_index_vector, parentedges_vector, parameters...)

The last 3 functions should return a boolean. If true: traversal continues. If false: traversal is stopped, that is, the next node is not processed.

See also traversal_postorder, and traversalupdate_default! for a default function that does nothing to V and returns true, with an adequate signature to be used here.

source
PhyloNetworks.unzip_canonical!Method
unzip_canonical!(net::HybridNetwork)

Unzip all reticulations: set the length of child edge to 0, and increase the length of both parent edges by the original child edge's length, to obtain the canonical version of the network according to Pardi & Scornavacca (2015).

Output: vector of hybrid node in postorder, vector of child edges whose length is constrained to be 0, and vector of their original branch lengths to re-zip if needed using rezip_canonical!.

Assumption: net.hybrid is correct, but a preordering of all nodes is not assumed.

Note: This unzipping is not as straightforward as it might seem, because of "nested" zippers: when the child of a hybrid node is itself a hybrid node. The unzipping is propagated all the way through.

source
PhyloNetworks.updateconstraintfields!Method
updateconstraintfields!(constraints::Vector{TopologyConstraint}, net::HybridNetwork)

Update fields stem edge and crown node to match the given net.

Assumes that the constraints are still met in net, and that nodes & edges are numbered identically in net as in the network used to create all constraints.

fixit: remove the assumption that constraints are still met, since an NNI near the crown of a constrained clade might change the crown node and / or the stem edge (u and v exchange).

source
PhyloNetworks.updateconstraints!Method
updateconstraints!(constraints::Vector{TopologyConstraint}, net::HybridNetwork)

Update the set taxonnum in each constraint, assuming that the stem edge and the crown node are still correct, and that their descendants are still correct. May be needed if the node and edge numbers were modified by resetnodenumbers! or resetedgenumbers!.

Warning: does not check that the names of leaves with numbers in taxonnum are taxonnames.

julia> net = readnewick("(((2a,2b),(((((1a,1b,1c),4),(5)#H1),(#H1,(6,7))))#H2),(#H2,10));");

julia> c_species1 = PhyloNetworks.TopologyConstraint(0x01, ["1a","1b","1c"], net)
Species constraint, on tips: 1a, 1b, 1c
 stem edge number 7
 crown node number -9

julia> c_species1.taxonnums
Set{Int64} with 3 elements:
  5
  4
  3

julia> c_clade145 = PhyloNetworks.TopologyConstraint(0x02, ["1a","1b","1c","4","5"], net)
Clade constraint, on tips: 1a, 1b, 1c, 4, 5
 stem edge number 12
 crown node number -7

julia> PhyloNetworks.resetnodenumbers!(net)

julia> net.node[4].number = 111;

julia> PhyloNetworks.updateconstraints!([c_species1, c_clade145], net)

julia> c_species1
Species constraint, on tips: 1a, 1b, 1c
 stem edge number 7
 crown node number 21

julia> c_species1.taxonnums
Set{Int64} with 3 elements:
  5
  4
  111
source

index