public documentation
Documentation for PhyloSummaries's public (exported) functions. Most functions are internal (not exported).
functions & types
PhyloSummaries.blobpartitions_support — Method
blobpartitions_support(networks, referencenet;
    minimumblobdegree=3, netweights=nothing)
Calculate the support for blob partitions and related features (circular orders, hybrid clades, bipartitions non-redundant with a blob) for the blobs present in a reference network referencenet, based on their frequency in a sample of networks.
Sample networks are weighted equally by default, unless a vector of netweights is provided (of same length as the vector of sample networks).
Output: a NamedTuple of tables, named as follows
- :blob_table: support for each blob partition in the reference network, including support for the circular order of the taxon blocks if the sample networks are of level 1.
- :hybrid_table: support for each taxon block to be below a blob's lowest hybrid node in a sample network.
- :bipartition_table: support for each non-redundant bipartition (cut-edge) in the reference network.
- :taxa: sorted taxon labels shared by all networks. taxa[i] is indicated as taxon i in the other tables.
See also: PhyloNetworks.treeedges_support, PhyloNetworks.hybridclades_support, consensus_treeofblobs, PhyloSummaries.count_blobpartitions.
example
julia> netfile = joinpath(dirname(pathof(PhyloSummaries)), "..", "test","level1_7taxa_abc.nwk");
julia> bootnet = readmultinewick(netfile); # could be bootstrap networks
julia> nwk = "(((a3,(a4,#H1)),a2),(((c2,(#H2,c1)),(b1)#H2))#H1,a1);";
julia> refnet = readnewick(nwk); # same as 5th bootnet
julia> res = blobpartitions_support(bootnet, refnet);
julia> keys(res)
(:blob_table, :circorder_table, :hybrid_table, :bipartition_table, :taxa)
julia> using DataFrames; DataFrame(res[:blob_table])
2×6 DataFrame
 Row │ blob   degree  node   support_partition  partition_num  partition
     │ Int64  Int64   Int64  Float64            String         String
─────┼──────────────────────────────────────────────────────────────────────────────
   1 │     2       4     -7                0.4  1,2,3,4|7|6|5  a1,a2,a3,a4|c2|c1|b1
   2 │     1       5     -2                0.6  1|2|3|4|5,6,7  a1|a2|a3|a4|b1,c1,c2
julia> DataFrame(res[:circorder_table])
2×5 DataFrame
 Row │ blob   order            support_circorder  partition_num  partition
     │ Int64  Tuple…           Float64            String         String
─────┼────────────────────────────────────────────────────────────────────────────────
   1 │     2  (1, 2, 3, 4)                   0.4  1,2,3,4|7|6|5  a1,a2,a3,a4|c2|c1|b1
   2 │     1  (1, 2, 3, 4, 5)                0.6  1|2|3|4|5,6,7  a1|a2|a3|a4|b1,c1,c2
julia> DataFrame(res[:hybrid_table])
3×7 DataFrame
 Row │ blob   node_from  node_to  edge   support_hybrid  cluster_num  cluster
     │ Int64  Int64      Int64    Int64  Float64         String       String
─────┼────────────────────────────────────────────────────────────────────────────
   1 │     2          6        8     13             0.2  5            b1
   2 │     2         -7        3     15             0.4  1,2,3,4      a1,a2,a3,a4
   3 │     1          3       -7     15             0.6  5,6,7        b1,c1,c2
julia> DataFrame(res[:bipartition_table]) # refnet has 0 non-redundant bipartitions
0×6 DataFrame
 Row │ node1  node2  edge   support_nonredundant  cluster_num  cluster
     │ Int64  Int64  Int64  Float64               String       String
─────┴─────────────────────────────────────────────────────────────────
To plot these support values onto the reference network, see examples in the package manual.
PhyloSummaries.consensus_level1network — Method
consensus_level1network(networks; proportion=0,
    minimumblobdegree=4, outgroup=nothing, netweights=nothing)
Consensus network summarizing a list of level-1 networks, by these steps:
- A consensus tree of blobs is built as in consensus_treeofblobs, with one node for each blob present in a majority (or in more than the required proportion) of input networks, if compatible with blobs of higher support, with non-redundant bipartitions supported by more than the proportion of input networks.
- Each blob is resolved as a cycle, one after another, from highest to lowest supported blobs.
  - To resolve a blob, its taxon blocks are placed around a cycle in the circular order most frequently found in input networks.
  - To orient the edges in the cycle, the node chosen to be hybrid is the one whose descendant clade has the highest (or second-highest) support as being a hybrid clade, among the placements that are compatible with each other.
    - If an outgroup is provided, the hybrid node is chosen among those that do not conflict with this taxon being an outgroup: a direct child of the root.
    - Otherwise, the total hybrid support is maximized (sum of hybrid support over all chosen hybrid clades).
By default, a greedy consensus is calculated. The majority-rule tree can be obtained by using proportion=0.5, and the strict consensus using proportion=1.
By default, all input networks have equal weight: the support for a feature (blob, circular order, hybrid clade, bipartition) is the proportion of networks with this feature. To give networks unequal weights, a netweights vector can be provided. There should be as many weights as there are input networks.
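As an illustration of the weighting described above, here is a minimal sketch in base Julia. It is not the package's implementation: the helper name weighted_support is hypothetical, and normalizing by the sum of the weights is an assumption.

```julia
# Hypothetical helper, not part of PhyloSummaries: support of one feature,
# given which sample networks display it and (optionally) their weights.
function weighted_support(hasfeature::AbstractVector{Bool}, netweights=nothing)
    n = length(hasfeature)
    n > 0 || throw(ArgumentError("empty list of networks"))
    # equal weights by default
    w = isnothing(netweights) ? fill(1.0, n) : netweights
    length(w) == n ||
        throw(ArgumentError("need as many weights as input networks"))
    # assumption: weights are normalized by their sum
    return sum(w[hasfeature]) / sum(w)
end

hasblob = [true, false, true, true, false]  # feature present in 3 of 5 networks
weighted_support(hasblob)                   # equal weights: 3/5
weighted_support(hasblob, [2, 1, 1, 1, 5])  # unequal weights: (2+1+1)/10
```

With equal weights this reduces to the proportion of networks that have the feature.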
See consensus_level1network_save to save the output.
PhyloSummaries.consensus_level1network_save — Function
consensus_level1network_save(result_object, rootname=nothing)
Write the results of consensus_level1network to 4 files, with file names starting with rootname and ending with "net.nwk" for the consensus network (with bipartition support as non-redundant with blobs), "blob.csv" for the table about blobs and their support, "hybrid.csv" for the table about hybrid clusters and their support, and "bipartition.csv" for the table about bipartitions and their support (as non-redundant with blobs).
For the hybrid and bipartition tables, the column for the edge number is not saved because re-reading the network from the newick file would most likely lead to different internal edge numbers, and using obsolete edge numbers could then cause unintended errors. Node numbers will also be different when re-reading the network from the newick file, but the original node numbers can be recovered with resetnodenumbers_fromnames!.
input:
- result_object should be 1 object, as output by consensus_level1network: named tuple including the keys :blob_table, :hybrid_table and :bipartition_table.
- rootname: string, all output files will start with this.
PhyloSummaries.consensus_treeofblobs — Method
consensus_treeofblobs(networks; proportion=0,
    minimumblobdegree=4, netweights=nothing)
Consensus tree summarizing the partitions of "interesting" blobs (nodes in the tree of blobs) and the non-redundant bipartitions (cut-edges connecting non-interesting blobs) that are shared by more than the required proportion of input networks. An error is thrown if the list of input networks is empty, or if the input networks do not all have the same tip labels.
This tree is to be considered as unrooted. It is built arbitrarily rooted at the last alphabetical taxon. Use rootatnode! or rootonedge! to re-root this tree given external knowledge of the outgroup (taxon or clade).
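For instance, re-rooting with a known outgroup could look like the sketch below. The small tree read from a newick string is only a stand-in for a consensus tree, and the choice of taxon "c" as outgroup is an assumption made for illustration.

```julia
using PhyloNetworks

# stand-in for a consensus tree (here simply read from a newick string)
tree = readnewick("(c,d,(b,(a1,a2)));")
# re-root the tree using the outgroup taxon "c" (hypothetical outgroup)
rootatnode!(tree, "c")
writenewick(tree)
```

rootonedge! can be used instead when the outgroup is a clade rather than a single taxon.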
The support for a blob is the proportion of input networks that have a blob with this partition. This is stored in the corresponding node's .fvalue. The support for a bipartition as non-redundant is the proportion of input networks that have this bipartition not adjacent to any interesting blob. This is stored in the corresponding edge's field .y. With option supportaslength=true, this is also stored in the edge's .length. This option is not recommended and may be removed.
By default, all input networks have equal weight: the support for a feature (blob or bipartition) is the proportion of networks with this feature. Optionally, a vector of netweights can be provided, to give networks unequal weights. There should be as many weights as there are input networks.
An "interesting" blob in an input network N is a non-trivial blob (with at least one hybrid node) of degree m ≥ 4 by default. The degree of a blob is the number of cut edges it is adjacent to, and also the degree of the associated node in N's tree of blobs. Setting minimumblobdegree to 3 will cause non-trivial blobs to be considered "interesting" even if their corresponding node in N's tree of blobs is of degree 3.
Note that a node of degree 4 or more in the network's tree of blobs may correspond to a polytomy in N: a single node incident to m cut-edges, but without any reticulation. These blobs are considered "non-interesting". A cut-edge incident to such a polytomy is then non-redundant, if the other blob it connects to is also non-interesting.
A chain of 2-blobs leads to multiple cut-edges sharing the same bipartition. This bipartition is counted only once (if non-trivial and non-redundant) as if 2-blobs had been suppressed in the input network.
By default, a greedy consensus is calculated. The majority-rule tree can be obtained by using proportion=0.5, and the strict consensus using proportion=1.
See also: consensus_level1network, count_blobpartitions!
PhyloSummaries.consensustree — Method
consensustree(trees::AbstractVector{PN.HybridNetwork};
    rooted=false,
    proportion=0,
    supportaslength=false)
Consensus tree summarizing the bipartitions (or clades) shared by more than the required proportion of input trees. An ArgumentError is thrown if one input network is not a tree, or the list of input trees is empty, or if the input trees do not all have the same tip labels. Input trees are not modified.
Output: consensus tree as an object of type HybridNetwork.
The bipartition (or clade) support values are stored
- as edge length with option supportaslength=true (not by default).
- in the field .y of each internal edge (as external edges correspond to trivial bipartitions, which must necessarily be in all sampled trees). This field is used by writenewick to write edge support, with its option support=true. However, this .y field is internal, so it can be modified by other functions and should not be relied upon.
By default, input trees are considered unrooted, and bipartitions are considered. Use rooted=true to consider all input trees as rooted, in which case clades (rather than bipartitions) are used to build the output rooted consensus tree.
By default, the greedy consensus is calculated: the tree is built from the bipartitions (or clades) with the highest support, until no more can be added. The majority-rule tree can be obtained by using proportion=0.5: it is built only from bipartitions (or clades) present in more than 50% of the input trees.
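The selection rule described above can be sketched in base Julia as follows. This is an illustration only, not the package's code: the helpers compatible and greedy_select are hypothetical, and the support values are the frequencies of the non-trivial bipartitions in the 3 unrooted trees used in the example of this docstring.

```julia
# Hypothetical sketch, not the package's code. A bipartition is stored as
# the set of taxa on one side; the other side is its complement in `taxa`.
taxa = Set(["a1", "a2", "b", "c", "d"])

# two bipartitions A|B and C|D are compatible (can be in the same tree)
# if at least one of A∩C, A∩D, B∩C, B∩D is empty
function compatible(A::Set, C::Set, taxa::Set)
    B, D = setdiff(taxa, A), setdiff(taxa, C)
    return any(isempty, (intersect(A, C), intersect(A, D),
                         intersect(B, C), intersect(B, D)))
end

# keep bipartitions with support > proportion, from highest to lowest
# support, skipping any that conflicts with one already kept
function greedy_select(support::Dict, taxa::Set; proportion=0)
    kept = Set{String}[]
    for (bp, s) in sort(collect(support); by=last, rev=true)
        s > proportion || break
        all(k -> compatible(bp, k, taxa), kept) && push!(kept, bp)
    end
    return kept
end

support = Dict(Set(["a1", "a2"]) => 3/3,
               Set(["c", "d"]) => 2/3,
               Set(["b", "d"]) => 1/3)
greedy_select(support, taxa)                   # keeps a1,a2|… and c,d|…
greedy_select(support, taxa; proportion=0.5)   # same 2 bipartitions here
greedy_select(support, taxa; proportion=0.75)  # keeps a1,a2|… only
```

In the greedy case, the bipartition b,d|a1,a2,c is dropped because it conflicts with c,d|a1,a2,b, which has higher support.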
assumptions and warnings:
- Input trees are assumed to have their edges correctly directed. If unsure, run directedges!.(trees) beforehand.
- Input trees should not have degree-2 nodes other than the root (nodes with only 1 parent and 1 child). If unsure, run removedegree2nodes!.(trees, true) to keep their root even if it is of degree 2, or removedegree2nodes!.(trees, false) to unroot them also.
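The two precautions above can be applied before building the consensus. Here is a minimal sketch, reusing the 3 trees from the example of this docstring. The function is called with its module name so that the sketch does not depend on which names are exported.

```julia
using PhyloNetworks
import PhyloSummaries

nwk = ["((c,d),((a1,a2),b));", "(((a2,a1),b),c,d);", "(((a1,a2),c),d,b);"]
trees = readnewick.(nwk)
directedges!.(trees)               # make sure that edges are correctly directed
removedegree2nodes!.(trees, true)  # suppress degree-2 nodes, but keep each root
con = PhyloSummaries.consensustree(trees)
```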
example
julia> nwk = ["((c,d),((a1,a2),b));", "(((a2,a1),b),c,d);", "(((a1,a2),c),d,b);"];
julia> treesample = readnewick.(nwk);
julia> con = consensustree(treesample); writenewick(con, round=true, support=true)
"(c,d,(b,(a1,a2)::1.0)::0.667);"
julia> con = consensustree(treesample; rooted=true); # greedy consensus
julia> writenewick(con, round=true, support=true)
"((d,c)::0.333,(b,(a1,a2)::1.0)::0.667);"
julia> [e.number => round(e.y, digits=3) for e in con.edge if !isexternal(e)] # edge number -> support
3-element Vector{Pair{Int64, Float64}}:
6 => 0.333
7 => 0.667
8 => 1.0
julia> con = consensustree(treesample; rooted=true, proportion=0.5); # majority-rule
julia> writenewick(con, round=true, support=true)
"(c,d,(b,(a1,a2)::1.0)::0.667);"
julia> con = consensustree(treesample; proportion=0.75, supportaslength=true) |> writenewick
"(b,c,d,(a2,a1):1.0);"
PhyloSummaries.edgenumbers_fromnodenumbers — Method
edgenumbers_fromnodenumbers(n1_nums::Vector, n2_nums::Vector, net::PN.HybridNetwork)
Vector of edge numbers, of edges in net between pairs of nodes of specified numbers. An error is thrown if, in one pair, the two nodes are not adjacent. This is to avoid unknowingly using wrong edge numbers for the other pairs of nodes.
Uses PhyloNetworks.getconnectingedge.
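As a small check of this behavior, the sketch below takes the node numbers at both ends of each edge of a 4-taxon tree, and asks for the corresponding edge numbers. The tree is an arbitrary illustration, and the expected result assumes that the function accepts plain vectors of integers, as its signature suggests.

```julia
using PhyloNetworks
import PhyloSummaries

net = readnewick("((a,b),(c,d));")
# node numbers at both ends of each edge, taken from the network itself
n1 = [getparent(e).number for e in net.edge]
n2 = [getchild(e).number for e in net.edge]
edgenums = PhyloSummaries.edgenumbers_fromnodenumbers(n1, n2, net)
# each pair of nodes is adjacent, so the edge numbers should be recovered
edgenums == [e.number for e in net.edge]
```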
PhyloSummaries.edgenumbers_fromnodenumbers — Method
edgenumbers_fromnodenumbers(table, net::PN.HybridNetwork)
Vector of edge numbers in net, with element i corresponding to the edge connecting (incident to) the two nodes listed in row i of the input table. This table may be a data frame or a named tuple, such as produced by consensus_treeofblobs and consensus_level1network. It should have 2 columns with node numbers, either named :node_from and :node_to, or :node1 and :node2.
PhyloSummaries.resetnodenumbers_fromnames! — Method
resetnodenumbers_fromnames!(net)
Reset node numbers so that:
- leaves are numbered 1 through the number of taxa, in alphabetical order of taxon names
- any internal node named _n* is numbered n, which should be an integer that does not start with 0, and * is what comes after
- any hybrid node named Hn is numbered n
- any node with an empty name is given some other number, so that node numbers are unique.
An error occurs if an internal node has a name that cannot be parsed as described above, or results in a number that is ≤ the number of taxa, or results in duplicated numbers.
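The naming convention above can be illustrated with a short base-Julia sketch. It is not the package's parser: the helper number_fromname is hypothetical, and the exact patterns (in particular, that a hybrid name contains nothing after its number) are assumptions.

```julia
# Hypothetical helper, not the package's code: number encoded in a node name,
# or `nothing` if the name does not follow the convention.
function number_fromname(name::AbstractString; hybrid::Bool=false)
    # hybrid nodes: "H" followed by the number (assumed to fill the whole name)
    # other internal nodes: "_", the number (no leading 0), then anything
    pattern = hybrid ? r"^H([1-9][0-9]*)$" : r"^_([1-9][0-9]*)"
    m = match(pattern, name)
    return isnothing(m) ? nothing : parse(Int, m[1])
end

number_fromname("_10")               # 10
number_fromname("_7_blob1")          # 7
number_fromname("H11"; hybrid=true)  # 11
number_fromname("_07")               # nothing: the number starts with 0
```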
See also: PhyloNetworks.resetedgenumbers! and PhyloNetworks.resetnodenumbers!.
example
julia> net = readnewick("((t2,(t1,(t4,#H11)_10))_8,(t3)#H11)_7_blob1;");
julia> printnodes(net)
node leaf  hybrid name     i_cycle edges'numbers
1    true  false  t2       -1      1
2    true  false  t1       -1      2
3    true  false  t4       -1      3
5    false false  _10      -1      3    4    5
-4   false false           -1      2    5    6
6    false false  _8       -1      1    6    7
7    true  false  t3       -1      8
4    false true   H11      -1      8    4    9
-2   false false  _7_blob1 -1      7    9
julia> resetnodenumbers_fromnames!(net);
julia> printnodes(net)
node leaf  hybrid name     i_cycle edges'numbers
2    true  false  t2       -1      1
1    true  false  t1       -1      2
4    true  false  t4       -1      3
10   false false  _10      -1      3    4    5
5    false false           -1      2    5    6
8    false false  _8       -1      1    6    7
3    true  false  t3       -1      8
11   false true   H11      -1      8    4    9
7    false false  _7_blob1 -1      7    9
index
- PhyloSummaries.blobpartitions_support
- PhyloSummaries.consensus_level1network
- PhyloSummaries.consensus_level1network_save
- PhyloSummaries.consensus_treeofblobs
- PhyloSummaries.consensustree
- PhyloSummaries.edgenumbers_fromnodenumbers
- PhyloSummaries.edgenumbers_fromnodenumbers
- PhyloSummaries.resetnodenumbers_fromnames!