public documentation

Documentation for PhyloSummaries's public (exported) functions. Most functions are internal (not exported).

functions & types

PhyloSummaries.blobpartitions_support Method
blobpartitions_support(networks, referencenet;
    minimumblobdegree=3, netweights=nothing)

Calculate the support for blob partitions and related features (circular orders, hybrid clades, bipartitions non-redundant with a blob) for the blobs present in a reference network referencenet, based on their frequency in a sample of networks.

Sample networks are weighted equally by default. To weight them unequally, provide a vector of netweights of the same length as the vector of sample networks.

Output: a NamedTuple of tables, named as follows

  • :blob_table: support for each blob partition in the reference network.
  • :circorder_table: support for the circular order of the taxon blocks around each blob, if the sample networks are of level 1.
  • :hybrid_table: support for each taxon block to be below a blob's lowest hybrid node in a sample network.
  • :bipartition_table: support for each non-redundant bipartition (cut-edge) in the reference network.
  • :taxa: sorted taxon labels shared by all networks. taxa[i] is referred to as taxon i in the other tables.
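
The relationship between the numeric and labelled partition columns can be illustrated with a small sketch in plain Julia. The helper name decodepartition is hypothetical (it is not part of PhyloSummaries); it only assumes that blocks are separated by "|" and taxon indices by ",", as in the example tables below.

```julia
# Hypothetical helper, not part of PhyloSummaries: translate a partition
# written with taxon indices into one written with taxon labels,
# using taxa[i] as the label of taxon i.
function decodepartition(partition_num::AbstractString, taxa::Vector{String})
    blocks = split(partition_num, "|")            # one entry per taxon block
    named = [join([taxa[parse(Int, i)] for i in split(block, ",")], ",")
             for block in blocks]
    return join(named, "|")
end

# sorted taxon labels, as they would appear in the :taxa entry for the example below
taxa = ["a1", "a2", "a3", "a4", "b1", "c1", "c2"]
decodepartition("1,2,3,4|7|6|5", taxa)  # "a1,a2,a3,a4|c2|c1|b1"
```

The result agrees with the partition_num and partition columns of the first row of the blob table in the example.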

See also: PhyloNetworks.treeedges_support, PhyloNetworks.hybridclades_support, consensus_treeofblobs, PhyloSummaries.count_blobpartitions.

example

julia> netfile = joinpath(dirname(pathof(PhyloSummaries)), "..", "test","level1_7taxa_abc.nwk");

julia> bootnet = readmultinewick(netfile); # could be bootstrap networks

julia> nwk = "(((a3,(a4,#H1)),a2),(((c2,(#H2,c1)),(b1)#H2))#H1,a1);";

julia> refnet = readnewick(nwk); # same as 5th bootnet

julia> res = blobpartitions_support(bootnet, refnet);

julia> keys(res)
(:blob_table, :circorder_table, :hybrid_table, :bipartition_table, :taxa)

julia> using DataFrames; DataFrame(res[:blob_table])
2×6 DataFrame
 Row │ blob   degree  node   support_partition  partition_num  partition            
     │ Int64  Int64   Int64  Float64            String         String               
─────┼──────────────────────────────────────────────────────────────────────────────
   1 │     2       4     -7                0.4  1,2,3,4|7|6|5  a1,a2,a3,a4|c2|c1|b1
   2 │     1       5     -2                0.6  1|2|3|4|5,6,7  a1|a2|a3|a4|b1,c1,c2

julia> DataFrame(res[:circorder_table])
2×5 DataFrame
 Row │ blob   order            support_circorder  partition_num  partition            
     │ Int64  Tuple…           Float64            String         String               
─────┼────────────────────────────────────────────────────────────────────────────────
   1 │     2  (1, 2, 3, 4)                   0.4  1,2,3,4|7|6|5  a1,a2,a3,a4|c2|c1|b1
   2 │     1  (1, 2, 3, 4, 5)                0.6  1|2|3|4|5,6,7  a1|a2|a3|a4|b1,c1,c2

julia> DataFrame(res[:hybrid_table])
3×7 DataFrame
 Row │ blob   node_from  node_to  edge   support_hybrid  cluster_num  cluster     
     │ Int64  Int64      Int64    Int64  Float64         String       String      
─────┼────────────────────────────────────────────────────────────────────────────
   1 │     2          6        8     13             0.2  5            b1
   2 │     2         -7        3     15             0.4  1,2,3,4      a1,a2,a3,a4
   3 │     1          3       -7     15             0.6  5,6,7        b1,c1,c2

julia> DataFrame(res[:bipartition_table]) # refnet has 0 non-redundant bipartitions
0×6 DataFrame
 Row │ node1  node2  edge   support_nonredundant  cluster_num  cluster 
     │ Int64  Int64  Int64  Float64               String       String  
─────┴─────────────────────────────────────────────────────────────────

To plot these support values onto the reference network, see examples in the package manual.

source
PhyloSummaries.consensus_level1network Method
consensus_level1network(networks; proportion=0,
    minimumblobdegree=4, outgroup=nothing, netweights=nothing)

Consensus network summarizing a list of level-1 networks, by these steps:

  1. A consensus tree of blobs is built as in consensus_treeofblobs. It has one node for each blob present in a majority (or in more than the required proportion) of input networks, if compatible with blobs of higher support, and it includes the non-redundant bipartitions supported by more than the required proportion of input networks.
  2. Each blob is resolved as a cycle, one after another, from highest to lowest supported blobs.
  3. To resolve a blob, its taxon blocks are placed around a cycle in the circular order most-frequently found in input networks.
  4. To orient the edges in the cycle, the node chosen to be hybrid is the one whose descendant clade has the highest (or second-highest) support as being a hybrid clade, among the placements that are compatible with each other.
    • If an outgroup is provided, the hybrid node is chosen among those that do not conflict with this taxon being an outgroup: a direct child of the root.
    • Otherwise, the total hybrid support is maximized (sum of hybrid support over all chosen hybrid clades).
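
Step 3 can be illustrated with a minimal sketch in plain Julia. It is not the package's implementation: the function names are hypothetical, circular orders are represented as vectors of taxon-block indices, and two orders are assumed to be the same if they differ only by rotation or reflection.

```julia
# Hypothetical sketch: canonical representative of a circular order,
# assuming orders are equal up to rotation and reflection.
function canonicalorder(order::Vector{Int})
    n = length(order)
    candidates = Vector{Int}[]
    for o in (order, reverse(order)), shift in 0:(n - 1)
        push!(candidates, circshift(o, shift))
    end
    return first(sort!(candidates))   # lexicographically smallest version
end

# Most frequent circular order in a sample, with its (unweighted) support.
function mostfrequentorder(orders)
    counts = Dict{Vector{Int},Int}()
    for o in orders
        c = canonicalorder(o)
        counts[c] = get(counts, c, 0) + 1
    end
    best, n = first(sort!(collect(counts); by = last, rev = true))
    return best, n / length(orders)
end

# toy sample: the first 3 orders are the same cycle written differently
orders = [[1, 2, 3, 4], [2, 3, 4, 1], [4, 3, 2, 1], [1, 3, 2, 4], [1, 2, 4, 3]]
mostfrequentorder(orders)  # ([1, 2, 3, 4], 0.6)
```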

By default, a greedy consensus is calculated. The majority-rule consensus can be obtained by using proportion=0.5, and the strict consensus using proportion=1.

By default, all input networks have equal weight: the support for a feature (blob, circular order, hybrid clade, bipartition) is the proportion of networks with this feature. To give networks unequal weights, a netweights vector can be provided. There should be as many weights as there are input networks.
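
As a rough illustration of how weights affect support, here is a sketch in plain Julia. The function weightedsupport is hypothetical, and it assumes that weights are rescaled to sum to 1; whether the package rescales netweights in this way is not stated here.

```julia
# Hypothetical sketch: support of a feature, given which networks have it.
# Assumption: weights are rescaled to sum to 1 before being added up.
function weightedsupport(hasfeature::AbstractVector{Bool}, netweights = nothing)
    n = length(hasfeature)
    if netweights === nothing
        w = fill(1 / n, n)                 # equal weights by default
    else
        length(netweights) == n ||
            throw(ArgumentError("there should be one weight per network"))
        w = netweights ./ sum(netweights)  # rescale to sum to 1
    end
    return sum(w[hasfeature])              # total weight of networks with the feature
end

hasfeature = [true, false, true, true, false]
weightedsupport(hasfeature)                             # proportion of networks: 3 out of 5
weightedsupport(hasfeature, [0.1, 0.4, 0.1, 0.1, 0.3])  # weights of networks 1, 3 and 4
```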

See consensus_level1network_save to save the output.

source
PhyloSummaries.consensus_level1network_save Function
consensus_level1network_save(result_object, rootname=nothing)

Write the results of consensus_level1network to 4 files, with file names starting with rootname and ending with "net.nwk" for the consensus network (with bipartition support as non-redundant with blobs), "blob.csv" for the table about blobs and their support, "hybrid.csv" for the table about hybrid clusters and their support, and "bipartition.csv" for the table about bipartitions and their support (as non-redundant with blobs).

For the hybrid and bipartition tables, the column for the edge number is not saved because re-reading the network from the newick file would most likely lead to different internal edge numbers, and using obsolete edge numbers could then cause unintended errors. Node numbers will also be different when re-reading the network from the newick file, but the original node numbers can be recovered with resetnodenumbers_fromnames!.

input:

  • result_object should be 1 object, as output by consensus_level1network: named tuple including the keys :blob_table, :hybrid_table and :bipartition_table.
  • rootname: string, all output files will start with this.
Files will be overwritten if they already exist.
source
PhyloSummaries.consensus_treeofblobs Method
consensus_treeofblobs(networks; proportion=0,
    minimumblobdegree=4, netweights=nothing)

Consensus tree summarizing the partitions of "interesting" blobs (nodes in the tree of blobs) and the non-redundant bipartitions (cut-edges connecting non-interesting blobs) that are shared by more than the required proportion of input networks. An error is thrown if the list of input networks is empty, or if the input networks do not all have the same tip labels.

This is an unrooted tree

This tree is to be considered as unrooted. It is built arbitrarily rooted at the last alphabetical taxon. Use rootatnode! or rootonedge! to re-root this tree given external knowledge of the outgroup (taxon or clade).

The support for a blob is the proportion of input networks that have a blob with this partition. This is stored in the corresponding node's .fvalue. The support for a bipartition as non-redundant is the proportion of input networks that have this bipartition not adjacent to any interesting blob. This is stored in the corresponding edge's field .y. With option supportaslength=true, this is also stored in the edge's .length. This option is not recommended and may be removed.

By default, all input networks have equal weight: the support for a feature (blob or bipartition) is the proportion of networks with this feature. Optionally, a vector of netweights can be provided, to give networks unequal weights. There should be as many weights as there are input networks.

An "interesting" blob in an input network N is a non-trivial blob (with at least one hybrid node) of degree m ≥ 4 by default. The degree of a blob is the number of cut edges it is adjacent to, and also the degree of the associated node in N's tree of blobs. Setting minimumblobdegree to 3 will cause non-trivial blobs to be considered "interesting" even if their corresponding node in N's tree of blobs is of degree 3.

Note that a node of degree 4 or more in the network's tree of blobs may correspond to a polytomy in N: a single node incident to m cut-edges, but without any reticulation. These blobs are considered "non-interesting". A cut-edge incident to such a polytomy is then non-redundant if the other blob it connects to is also non-interesting.
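
The definitions above can be restated as a small sketch in plain Julia. The function names and the way a blob is summarized (number of hybrid nodes, degree) are hypothetical and chosen for illustration only.

```julia
# Hypothetical sketch: a blob is summarized by its number of hybrid nodes and
# its degree (number of adjacent cut edges).
# "Interesting" = non-trivial (at least 1 hybrid node) and of large enough degree.
isinteresting(nhybrids::Integer, degree::Integer; minimumblobdegree = 4) =
    nhybrids ≥ 1 && degree ≥ minimumblobdegree

# A cut-edge is non-redundant if both blobs it connects are non-interesting.
# Each blob is passed as a (nhybrids, degree) tuple.
isnonredundant(blob1, blob2; minimumblobdegree = 4) =
    !isinteresting(blob1...; minimumblobdegree) &&
    !isinteresting(blob2...; minimumblobdegree)

isinteresting(1, 4)                         # true: one reticulation, degree 4
isinteresting(0, 5)                         # false: polytomy without reticulation
isinteresting(1, 3)                         # false with the default minimum degree
isinteresting(1, 3; minimumblobdegree = 3)  # true
isnonredundant((0, 5), (0, 3))              # true: polytomy next to a tree node
```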

A chain of 2-blobs leads to multiple cut-edges sharing the same bipartition. This bipartition is counted only once (if non-trivial and non-redundant), as if 2-blobs had been suppressed in the input network.

By default, a greedy consensus is calculated. The majority-rule tree can be obtained by using proportion=0.5, and the strict consensus using proportion=1.

See also: consensus_level1network, count_blobpartitions!

source
PhyloSummaries.consensustree Method
consensustree(trees::AbstractVector{PN.HybridNetwork};
              rooted=false,
              proportion=0,
              supportaslength=false)

Consensus tree summarizing the bipartitions (or clades) shared by more than the required proportion of input trees. An ArgumentError is thrown if one input network is not a tree, or the list of input trees is empty, or if the input trees do not all have the same tip labels. Input trees are not modified.

Output: consensus tree as an object of type HybridNetwork.

The bipartition (or clade) support values are stored

  • as edge length with option supportaslength=true (not by default).
  • in the field .y of each internal edge (as external edges correspond to trivial bipartitions, which must necessarily be in all sampled trees). This field is used by writenewick to write edge support, with its option support=true. However, this .y field is internal, so it can be modified by other functions and should not be relied upon.

By default, input trees are considered unrooted, and bipartitions are considered. Use rooted=true to consider all input trees as rooted, in which case clades (rather than bipartitions) are used to build the output rooted consensus tree.

By default, the greedy consensus is calculated: the tree is built from the bipartitions (or clades) with the highest support, until no more can be added. The majority-rule tree can be obtained by using proportion=0.5: it is built only from bipartitions (or clades) present in more than 50% of the input trees.
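
The greedy selection of bipartitions can be sketched in plain Julia as follows. This is an illustration under simplifying assumptions, not the package's implementation: the function names are hypothetical, each unrooted tree is reduced by hand to its non-trivial bipartitions, and each bipartition is stored as the side that excludes the reference taxon d. The three trees are those of the example below.

```julia
# Non-trivial bipartitions of each input tree, stored as the side without "d".
trees = [
    [Set(["a1", "a2"]), Set(["a1", "a2", "b"])],  # ((c,d),((a1,a2),b));
    [Set(["a1", "a2"]), Set(["a1", "a2", "b"])],  # (((a2,a1),b),c,d);
    [Set(["a1", "a2"]), Set(["a1", "a2", "c"])],  # (((a1,a2),c),d,b);
]

# Support = proportion of trees that contain each bipartition.
function bipartitionsupport(trees)
    counts = Dict{Set{String},Int}()
    for tree in trees, bp in tree
        counts[bp] = get(counts, bp, 0) + 1
    end
    return Dict(bp => c / length(trees) for (bp, c) in counts)
end

# Two sides that both exclude the reference taxon are compatible
# if they are disjoint or nested.
compatible(a, b) = isdisjoint(a, b) || issubset(a, b) || issubset(b, a)

# Add bipartitions from highest to lowest support, keeping those whose support
# exceeds `proportion` and that are compatible with all bipartitions kept so far.
function greedybipartitions(trees; proportion = 0)
    ranked = sort!(collect(bipartitionsupport(trees)); by = last, rev = true)
    kept = Pair{Set{String},Float64}[]
    for (bp, s) in ranked
        s > proportion || break
        all(compatible(bp, k) for (k, _) in kept) && push!(kept, bp => s)
    end
    return kept
end

greedybipartitions(trees)                     # keeps {a1,a2} and {a1,a2,b}
greedybipartitions(trees; proportion = 0.75)  # keeps {a1,a2} only
```

With the default proportion, the bipartition {a1,a2,c} (support 1/3) is rejected because it conflicts with {a1,a2,b} (support 2/3), which is consistent with the unrooted consensus shown in the example.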

assumptions and warnings:

  • Input trees are assumed to have their edges correctly directed. If unsure, run directedges!.(trees) beforehand.
  • Input trees should not have degree-2 nodes (nodes with only 1 parent and 1 child) other than the root. If unsure, run removedegree2nodes!.(trees, true) to keep their root even if it is of degree 2, or removedegree2nodes!.(trees, false) to also unroot them.

example

julia> nwk = ["((c,d),((a1,a2),b));", "(((a2,a1),b),c,d);", "(((a1,a2),c),d,b);"];

julia> treesample = readnewick.(nwk);

julia> con = consensustree(treesample); writenewick(con, round=true, support=true)
"(c,d,(b,(a1,a2)::1.0)::0.667);"

julia> con = consensustree(treesample; rooted=true); # greedy consensus

julia> writenewick(con, round=true, support=true)
"((d,c)::0.333,(b,(a1,a2)::1.0)::0.667);"

julia> [e.number => round(e.y, digits=3) for e in con.edge if !isexternal(e)] # edge number -> support
3-element Vector{Pair{Int64, Float64}}:
 6 => 0.333
 7 => 0.667
 8 => 1.0

julia> con = consensustree(treesample; rooted=true, proportion=0.5); # majority-rule

julia> writenewick(con, round=true, support=true)
"(c,d,(b,(a1,a2)::1.0)::0.667);"

julia> con = consensustree(treesample; proportion=0.75, supportaslength=true) |> writenewick
"(b,c,d,(a2,a1):1.0);"
source
PhyloSummaries.edgenumbers_fromnodenumbers Method
edgenumbers_fromnodenumbers(n1_nums::Vector, n2_nums::Vector, net::PN.HybridNetwork)

Vector of edge numbers, of edges in net between pairs of nodes of specified numbers. An error is thrown if, in one pair, the two nodes are not adjacent. This is to avoid unknowingly using wrong edge numbers for the other pairs of nodes.

Uses PhyloNetworks.getconnectingedge.
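
The behavior described above (one edge number per pair of nodes, and an error as soon as one pair is not adjacent) can be sketched on a toy edge list in plain Julia. The function toyedgenumbers and the Dict representation are hypothetical, and do not use the HybridNetwork type. The edge list is read off the second printnodes output in the resetnodenumbers_fromnames! example.

```julia
# Toy edge list: edge number => numbers of the 2 nodes it connects.
edges = Dict(1 => (2, 8), 2 => (1, 5), 3 => (4, 10), 4 => (10, 11), 5 => (5, 10),
             6 => (5, 8), 7 => (7, 8), 8 => (3, 11), 9 => (7, 11))

# Hypothetical sketch: edge number for each pair (n1_nums[i], n2_nums[i]).
function toyedgenumbers(n1_nums::Vector, n2_nums::Vector, edges::Dict)
    length(n1_nums) == length(n2_nums) ||
        throw(ArgumentError("the 2 vectors of node numbers should have the same length"))
    lookup = Dict{Tuple{Int,Int},Int}()
    for (num, (a, b)) in edges
        lookup[minmax(a, b)] = num       # node order within a pair does not matter
    end
    return map(n1_nums, n2_nums) do a, b
        haskey(lookup, minmax(a, b)) || error("nodes $a and $b are not adjacent")
        lookup[minmax(a, b)]
    end
end

toyedgenumbers([10, 5], [11, 8], edges)  # [4, 6]
```

Calling toyedgenumbers([1], [2], edges) throws an error, because leaves 1 and 2 are not adjacent.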

source
PhyloSummaries.edgenumbers_fromnodenumbers Method
edgenumbers_fromnodenumbers(table, net::PN.HybridNetwork)

Vector of edge numbers in net, with element i corresponding to the edge connecting (incident to) the two nodes listed in row i of the input table. This table may be a data frame or a named tuple, such as produced by consensus_treeofblobs and consensus_level1network. It should have 2 columns with node numbers, either named :node_from and :node_to, or :node1 and :node2.

source
PhyloSummaries.resetnodenumbers_fromnames! Method
resetnodenumbers_fromnames!(net)

Reset node numbers so that:

  • leaves are numbered 1 through the number of taxa, in alphabetical order of taxon names
  • any internal node named _n* is numbered n, where n should be an integer that does not start with 0, and * is whatever comes after n
  • any hybrid node named Hn is numbered n
  • any node with an empty name is given some other number, so that node numbers are unique.

An error occurs if an internal node has a name that cannot be parsed as described above, or results in a number that is ≤ the number of taxa, or results in duplicated numbers.
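
The naming rules can be illustrated with a short sketch in plain Julia. The function numberfromname is hypothetical, and the regular expressions are one possible reading of the rules above, not the package's parsing code.

```julia
# Hypothetical sketch: node number implied by an internal node's name,
# or nothing if the name does not follow the _n* or Hn patterns.
function numberfromname(name::AbstractString)
    m = match(r"^_([1-9]\d*)", name)   # _n*: n is an integer not starting with 0
    if m === nothing
        m = match(r"^H(\d+)$", name)   # Hn: hybrid node name
    end
    return m === nothing ? nothing : parse(Int, m.captures[1])
end

numberfromname("_10")       # 10
numberfromname("_7_blob1")  # 7
numberfromname("H11")       # 11
numberfromname("")          # nothing: empty names get some other number
```

These values match the node numbers assigned to _10, _7_blob1 and H11 in the example below.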

See also: PhyloNetworks.resetedgenumbers! and PhyloNetworks.resetnodenumbers!.

example

julia> net = readnewick("((t2,(t1,(t4,#H11)_10))_8,(t3)#H11)_7_blob1;");

julia> printnodes(net)
node leaf  hybrid name     i_cycle edges'numbers
1    true  false  t2       -1      1   
2    true  false  t1       -1      2   
3    true  false  t4       -1      3   
5    false false  _10      -1      3    4    5   
-4   false false           -1      2    5    6   
6    false false  _8       -1      1    6    7   
7    true  false  t3       -1      8   
4    false true   H11      -1      8    4    9   
-2   false false  _7_blob1 -1      7    9   

julia> resetnodenumbers_fromnames!(net);

julia> printnodes(net)
node leaf  hybrid name     i_cycle edges'numbers
2    true  false  t2       -1      1   
1    true  false  t1       -1      2   
4    true  false  t4       -1      3   
10   false false  _10      -1      3    4    5   
5    false false           -1      2    5    6   
8    false false  _8       -1      1    6    7   
3    true  false  t3       -1      8   
11   false true   H11      -1      8    4    9   
7    false false  _7_blob1 -1      7    9   
source

index