internal documentation
Documentation for PhyloTraits's internal functions. These functions are not exported and their access (API) should not be considered stable. But they can still be used, like this for example: PhyloTraits.foo() for a function named foo().
functions & types
PhyloTraits.BM — Type
BM(λ)Brownian Motion, subtype of ContinuousTraitEM, to model the population mean of a trait (or of the residuals from a linear model). Under the BM model, the population (or species) means have a multivariate normal distribution with covariance matrix = σ²λV, where σ² is the between-species variance-rate (to be estimated), and the matrix V is obtained from PhyloNetworks.sharedpathmatrix(net)[:tips].
λ is set to 1 by default, and is immutable. In future versions, λ may be used to control the scale for σ².
On a tree, V is the length of shared ancestry. On a network, the BM model assumes that the trait at a hybrid node is the weighted average of its immediate parents (plus possibly a fixed shift). The weights are the proportion of genes inherited from each parent: the γ parameters of hybrid edges.
PhyloTraits.ContinuousTraitEM — Type
ContinuousTraitEMAbstract type for evolutionary models for continuous traits, using a continuous-time stochastic process on a phylogeny.
For subtypes, see BM, PagelLambda, ScalingHybrid, GaussianCoalescent.
Each of these subtypes/models has the field lambda, whose default value is 1.0. However, the interpretation of this field differs across models.
PhyloTraits.GaussianCoalescent — Type
GaussianCoalescent(v0,σ²,Ne,lambda)ContinuousTraitEM type to model the evolution of a polygenic trait X with variance v0 at the root population, and variance rate σ². The model assumes that X is controlled by a large number L of loci, each locus l having an infinitesimally small effect Yl on the trait, acting additively across loci: X = (sum_l Yl)/√L. The normalization by √L is to simplify notations: the effect of locus l is Yl/√L, and becomes infinitesimally small as L grows to infinity.
v0 is the variance of each Yl, and of X in the root population. Each Yl evolves along each own gene tree, arising from the population phylogeny according to the network multispecies coalescent process. Conditional on its gene tree, Yl evolves according to a centered process with increments of variance σ²t over t generations, such as a Brownian motion process, a compound Poisson process, or a variance-gamma process.
From the Central Limit Theorem, X = (sum_l Yl)/√L converges to a Gaussian when L goes to infinity. This model assumes that the polygenic trait X is Gaussian, with a variance-covariance structure given by the process described above, and computed with function gaussiancoalescent_covariancematrix. The model prescribes both the covariances between populations (or species), and within-populations.
Ne is the haploid effective population size (that is, 2N in the standard case of autosomal loci in diploid organisms). Setting Ne=1 corresponds to using branch lengths in coalescent units in the species phylogeny, and interpreting σ² as a variance rate per coalescent unit.
Ne and σ² are assumed constant across populations.
The process can be parametrized using λ = v0/(Ne σ²), which is equal to 1 if the root population is at equilibrium.
PhyloTraits.NucleicAcidSubstitutionModel — Type
NucleicAcidSubstitutionModelAdapted from SubstitutionModels.jl in BioJulia. The same Q and P function names are used for the transition rates and probabilities.
PhyloTraits.OptSummary — Type
OptSummary{T<:AbstractFloat}Summary of an NLopt optimization. Idea and code taken from MixedModels. T is the type of the function argument(s) and of the function value.
PhyloTraits.PagelLambda — Type
PagelLambda(λ)Pagel's λ model, subtype of ContinuousTraitEM, with covariance matrix σ²V(λ). σ² is the between-species variance-rate (to be estimated), and V(λ) = λV + (1-λ)T, where V is the covariance under a Brownian motion BM and T is a diagonal matrix containing the total branch length elapsed from the root to each leaf (if the phylogeny is a tree, or more generally if the network is time consistent: the time from the root to a given node does not depend on the path).
λ ∈ [0,1] is mutable and may be optimized. It is a measure of phylogenetic signal, that is, how important the given network is for explaining variation in the response. When λ=1, the PagelLambda model reduces to the BM model.
PhyloTraits.ScalingHybrid — Type
ScalingHybrid(λ)Scaling Hybrid model, subtype of ContinuousTraitEM, with covariance matrix σ²V(N(λ)). σ² is the between-species variance-rate (to be estimated), V(N) is the Brownian motion BM covariance obtained from network N, and N(λ) is a obtained from the input network by rescaling the inheritance parameter γ of all minor edges by the same λ: a minor edge has its original γ changed to λγ, using the same λ at all reticulations. Note that for a major edge with original inheritance γ, the partner minor edge has inheritance γminor = 1-γ, so the major edge's inheritance is changed to 1-λγminor = λγ+1-λ. Warning: this model assumes that all hybrid nodes are bicombining, that is, they have exactly 2 parents each.
For more information: see Bastide (2017) dissertation, section 4.3.2 p.175, available at here.
λ ∈ [0,λmax] is mutable and may be optimized. It is a measure of how important the reticulations are for explaining variation in the response.
- When λ = 1, the
ScalingHybridmodel reduces to theBMmodel on the original network N. - λ = 0 corresponds to the BM model on N's major tree (after removing all minor hybrid edges). Please note that testing the null hypothesis "λ = 0" would require comparing the likelihood ratio statistic to a null distribution that is not a chi-square distribution with 1 degree freedom, because 0 is at the boundary of allowable λ values. A mixture of 2 chi-square distributions might be appropriate (see for example Mitchell, Allman & Rhodes 2019 and references therein).
- λmax = 1/maximum(γminor) ≥ 1. It is the maximum λ value such that λγminor ≤ 1 at all hybrid nodes.
PhyloTraits.StatisticalSubstitutionModel — Type
StatisticalSubstitutionModel(
model::SubstitutionModel,
ratemodel::RateVariationAcrossSites,
net::HybridNetwork,
trait::AbstractVector,
siteweight::Union{Nothing, Vector{Float64}}=nothing,
maxhybrid::Int=length(net.hybrid)
)Inner constructor. Makes a deep copy of the input model, rate model. Warning: does not make a deep copy of the network: modification of the object.net would modify the input net. Assumes that the network has valid gamma values (to extract displayed trees).
StatisticalSubstitutionModel(
net::HybridNetwork,
fastafile::String,
modsymbol::Symbol,
rvsymbol::Symbol=:noRV,
ratecategories::Int=4;
maxhybrid::Int=length(net.hybrid)
)Constructor from a network and a fasta file. The model symbol should be one of :JC69, :HKY85, :ERSM or :BTSM. The rvsymbol should be as required by RateVariationAcrossSites.
The network's gamma values are modified if they are missing. After that, a deep copy of the network is passed to the inner constructor.
PhyloTraits.StatisticalSubstitutionModel — Type
StatisticalSubstitutionModelSubtype of StatsBase.StatisticalModel, to fit discrete data to a model of trait substitution along a network. See fitdiscrete to fit a trait substitution model to discrete data. It returns an object of type StatisticalSubstitutionModel, to which standard functions can be applied, like loglikelihood(object), aic(object) etc.
PhyloTraits.SubstitutionModel — Type
SubstitutionModelAbstract type for substitution models, using a continous time Markov model on a phylogeny. Adapted from SubstitutionModels.jl in BioJulia.
For variable rates, see RateVariationAcrossSites
For sub types, see NucleicAcidSubstitutionModel, TraitSubstitutionModel
All these models are supposed to have fields rate and eigeninfo.
PhyloTraits.WithinSpeciesCTM — Type
WithinSpeciesCTMType to fit models accounting for within-species variation, including measurement error, genetic variation between individuals, plasticity, environmental variation etc. CTM stands for "continuous trait model". Contains the estimated variance components (between-species phylogenetic variance rate and within-species variance) and output from the NLopt optimization used in the estimation.
fields
wsp_var: intra/within-species variance.bsp_var: inter/between-species variance-rate.wsp_ninv: vector of the inverse sample-sizes (e.g. [1/n₁, ..., 1/nₖ], where data from k species was used to fit the model and nᵢ is the no. of observations for the ith species).rss: within-species sum of squaresoptsum: anOptSummaryobject.
Base.getindex — Function
getindex(obj::TraitSimulation, d::Symbol, w::Symbol)Getting submatrices of an object of type TraitSimulation.
Arguments
obj::TraitSimulation: the matrix from which to extract.d: symbol specifying which sub-matrix to extract. Can be::tipscolumns and/or rows corresponding to the tips:internalnodescolumns and/or rows corresponding to the internal nodes
w: symbol specifying whether simulated (:sim) or mean expected (:exp) values are desired.
PhyloTraits.P! — Method
P!(Pmat::AbstractMatrix, obj::SM, t::Float64)Fill in the input matrix Pmat with the transition rates to go from each state to another in time t, according to rates in Q. see also: P.
julia> m1 = BinaryTraitSubstitutionModel([1.0,2.0], ["low","high"])
Binary Trait Substitution Model:
rate low→high α=1.0
rate high→low β=2.0
julia> PhyloTraits.P!(Matrix{Float64}(undef,2,2), m1, 0.3) # fills an uninitialized 2x2 matrix of floats
2×2 Matrix{Float64}:
0.80219 0.19781
0.39562 0.60438
julia> m2 = JC69([1.]);
julia> PhyloTraits.P!(Matrix{Float64}(undef,4,4), m2, 0.2)
4×4 Matrix{Float64}:
0.824446 0.0585179 0.0585179 0.0585179
0.0585179 0.824446 0.0585179 0.0585179
0.0585179 0.0585179 0.824446 0.0585179
0.0585179 0.0585179 0.0585179 0.824446 PhyloTraits.P — Method
P(obj, t)Probability transition matrix for a SubstitutionModel, of the form
P[1,1] ... P[1,k]
. .
. .
P[k,1] ... P[k,k]where P[i,j] is the probability of ending in state j after time t, given that the process started in state i. see also: P!.
HKY example:
julia> m1 = HKY85([0.5], [0.20, 0.30, 0.30, 0.20])
HKY85 Substitution Model base frequencies: [0.2, 0.3, 0.3, 0.2]
relative rate version with transition/tranversion ratio kappa = 0.5,
scaled so that there is one substitution per unit time
rate matrix Q:
A C G T
A * 0.4839 0.2419 0.3226
C 0.3226 * 0.4839 0.1613
G 0.1613 0.4839 * 0.3226
T 0.3226 0.2419 0.4839 *
julia> PhyloTraits.P(m1, 0.2)
4×4 StaticArraysCore.MMatrix{4, 4, Float64, 16} with indices SOneTo(4)×SOneTo(4):
0.81592 0.0827167 0.0462192 0.0551445
0.0551445 0.831326 0.0827167 0.0308128
0.0308128 0.0827167 0.831326 0.0551445
0.0551445 0.0462192 0.0827167 0.81592 Juke-Cantor example:
julia> m1 = JC69([1.]);
julia> PhyloTraits.P(m1, 0.2)
4×4 StaticArraysCore.MMatrix{4, 4, Float64, 16} with indices SOneTo(4)×SOneTo(4):
0.824446 0.0585179 0.0585179 0.0585179
0.0585179 0.824446 0.0585179 0.0585179
0.0585179 0.0585179 0.824446 0.0585179
0.0585179 0.0585179 0.0585179 0.824446 PhyloTraits.anova — Method
anova(objs::PhyloNetworkLinearModel...)Takes several nested fits of the same data, and computes the F statistic for each pair of models.
The fits must be results of function phylolm called on the same data, for models that have more and more effects.
Returns a DataFrame object with the anova table.
PhyloTraits.check_matchtaxonnames! — Method
check_matchtaxonnames!(species, data, net; kwargs...)Modify species and dat by removing the species (rows) absent from the network. Return a new network (net is not modified) with tips matching those in species: if some species in net have no data, these species are pruned from the network. The network also has its node names reset, such that leaves have nodes have consecutive numbers starting at 1, with leaves first.
The optional keyword argument suppresswarnings, which is false by default, can be set to true to suppress warnings about taxa in the network without data.
Used by fitdiscrete to build a new StatisticalSubstitutionModel.
PhyloTraits.defaultsubstitutionmodel — Function
defaultsubstitutionmodel(network, modsymbol::Symbol, data::DataFrame,
siteweights::Vector)Return a statistical substitution model (SSM) with appropriate state labels and a rate appropriate for the branch lengths in net (see startingrate). The data frame must have the actual trait/site data in columns 2 and up, as when the species names are in column 1. For DNA data, the relative rate model is returned, with a stationary distribution equal to the empirical frequencies.
PhyloTraits.discrete_backwardlikelihood_trait! — Method
discrete_backwardlikelihood_trait!(obj::SSM, tree::Integer, ri::Integer)Update and return the backward likelihood (last argument backwardlik) assuming rate category ri and tree index tree, using current forward and backwards likelihoods in obj: these depend on the trait (or site) given to the last call to discrete_corelikelihood_trait!. Used by ancestralreconstruction.
warning: assume correct transition probabilities.
PhyloTraits.discrete_corelikelihood! — Method
discrete_corelikelihood!(obj::StatisticalSubstitutionModel;
whichtrait::AbstractVector{Int} = 1:obj.nsites)Calculate the likelihood and update obj.loglik for discrete characters on a network, calling discrete_corelikelihood_trait!. The algorithm extracts all displayed trees and weighs the likelihood under all these trees. The object's partial likelihoods are updated:
- forward and direct partial likelihoods are re-used, one trait at a time,
- overall likelihoods on each displayed tree, given each rate category and for each given site/trait: are cached in
_loglikcache.
PhyloTraits.discrete_corelikelihood_trait! — Method
discrete_corelikelihood_trait!(obj::SSM, t::Integer, ci::Integer, ri::Integer)Return the likelihood for tree t, trait (character/site) index ci and rate category ri. Update & modify the forward & directional log-likelihoods obj.forwardlik and obj.directlik, which are indexed by [state, nodenumber or edgenumber]. Used by discrete_corelikelihood!.
Preconditions: obj.logtrans updated, edges directed, nodes/edges preordered
PhyloTraits.formatinterval — Function
formatinterval(obj::ReconstructedStates, pred::DataFrame, withexpectation::Bool=false, digits::Int=2)Format the prediction intervals for the plotting function. If withexpectation is set to true, then the best predicted value is also shown along with the interval.
PhyloTraits.gaussiancoalescent_covariancematrix — Method
gaussiancoalescent_covariancematrix(
net::HybridNetwork,
lambda::Real;
checkedges::Bool=true,
preorder::Bool=true)[cM,eV] of MatrixTopologicalOrder objects, where cM is the matrix of (co)variances of population means, and eV is the expected within-population variance, under the Gaussian model with coalescence, under: σ2=1 per coalescent unit and variance v0 = λ*σ2 within the root population.
For a general value of σ2, all (co)variances simply need to be multiplied by σ2. λ=1 corresponds to the root population being at equilibrium (after evolving under this model for a long time, the within-population variance is ≈ σ2).
Warning: assumes that edge lengths in net are in coalescent units (number of generations / haploid effective population size), and that σ2 is the variance rate per coalescent unit. σ2 is also the within-species variance at equilibrium.
PhyloTraits.getGammas — Method
getGammas(net)Vector of inheritance γ's of all major edges (tree edges and major hybrid edges), ordered according to the pre-order index of their child node, assuming this pre-order is already calculated (with up-to-date field vec_node). Here, a "major" edge is an edge with field ismajor set to true, regardless of its actual γ (below, at or above 0.5).
See setGammas!
PhyloTraits.getmissingtipmarks — Function
getmissingtipmarks(obj::ReconstructedStates, missingmark::AbstractString="*")Create a vector of string, with a missingmark for tips that are missing.
PhyloTraits.getparameters — Method
getparameters(obj::RateVariationAcrossSites)Return a copy of the alpha and/or pinv parameters of model obj, in a single vector.
PhyloTraits.getparamindex — Method
getparamindex(obj::RateVariationAcrossSites)Indices of parameters in (p_invariable, alpha).
PhyloTraits.getpredint — Method
getpredint(obj::ReconstructedStates; level::Real=0.95)Prediction intervals with level level for internal nodes and missing tips.
PhyloTraits.lambda! — Method
lambda!(m::PhyloNetworkLinearModel, newlambda)
lambda!(m::ContinuousTraitEM, newlambda)Assign a new value to the lambda parameter.
PhyloTraits.lambda! — Method
For a GaussianCoalescent model, assign new λ value, keeping σ2 fixed and adjusting v0 = new λ * (Ne σ2)
PhyloTraits.lambda — Method
lambda(m::PhyloNetworkLinearModel)
lambda(m::ContinuousTraitEM)Value assigned to the lambda parameter, if appropriate.
PhyloTraits.learnlabels — Method
learnlabels(model::Symbol, dat::DataFrame)Return unique non-missing values in dat, and check that these labels can be used to construct of substitution model of type model.
examples:
julia> using DataFrames
julia> dat = DataFrame(trait1 = ["A", "C", "A", missing]); # 4×1 DataFrame
julia> PhyloTraits.learnlabels(:BTSM, dat)
2-element Vector{String}:
"A"
"C"
julia> PhyloTraits.learnlabels(:JC69, dat)
2-element Vector{String}:
"A"
"C"PhyloTraits.posterior_loghybridweight — Function
posterior_loghybridweight(obj::SSM, hybrid_name, trait = 1)
posterior_loghybridweight(obj::SSM, edge_number, trait = 1)Log-posterior probability for all trees displaying the minor parent edge of hybrid node named hybrid_name, or displaying the edge number edge_number. That is: log of P(hybrid minor parent | trait) if a single trait is requested, or A[i]= log of P(hybrid minor parent | trait i) if trait is a vector or range (e.g. trait = 1:obj.nsites). These probabilities are conditional on the model parameters in obj.
Precondition: _loglikcache updated by discrete_corelikelihood!
examples
julia> net = readnewick("(((A:2.0,(B:1.0)#H1:0.1::0.9):1.5,(C:0.6,#H1:1.0::0.1):1.0):0.5,D:2.0);");
julia> m1 = BinaryTraitSubstitutionModel([0.1, 0.1], ["lo", "hi"]); # arbitrary rates
julia> using DataFrames
julia> dat = DataFrame(species=["C","A","B","D"], trait=["hi","lo","lo","hi"]);
julia> fit = fitdiscrete(net, m1, dat); # optimized rates: α=0.27 and β=0.35
julia> plhw = PhyloTraits.posterior_loghybridweight(fit, "H1");
julia> round(exp(plhw), digits=5) # posterior probability of going through minor hybrid edge
0.08017
julia> hn = net.node[3]; getparentedgeminor(hn).gamma # prior probability
0.1PhyloTraits.posterior_logtreeweight — Function
posterior_logtreeweight(obj::SSM, trait = 1)Array A of log-posterior probabilities for each tree displayed in the network: such that A[t] = log of P(tree t | trait trait) if a single trait is requested, or A[t,i]= log of P(tree t | trait i) if trait is a vector or range (e.g. trait = 1:obj.nsites). These probabilities are conditional on the model parameters in obj.
Displayed trees are listed in the order in which they are stored in the fitted model object obj.
Precondition: _loglikcache updated by discrete_corelikelihood!
examples
julia> net = readnewick("(((A:2.0,(B:1.0)#H1:0.1::0.9):1.5,(C:0.6,#H1:1.0::0.1):1.0):0.5,D:2.0);");
julia> m1 = BinaryTraitSubstitutionModel([0.1, 0.1], ["lo", "hi"]); # arbitrary rates
julia> using DataFrames
julia> dat = DataFrame(species=["C","A","B","D"], trait=["hi","lo","lo","hi"]);
julia> fit = fitdiscrete(net, m1, dat); # optimized rates: α=0.27 and β=0.35
julia> pltw = PhyloTraits.posterior_logtreeweight(fit);
julia> round.(exp.(pltw), digits=5) # posterior trees probabilities (sum up to 1)
2-element Vector{Float64}:
0.91983
0.08017
julia> round.(exp.(fit.priorltw), digits=4) # the prior tree probabilities are similar here (tiny data set!)
2-element Vector{Float64}:
0.9
0.1PhyloTraits.setGammas! — Method
setGammas!(net, γ vector)Set inheritance γ's of hybrid edges, using input vector for major edges. Assume pre-order calculated already, with up-to-date field vec_node. See getGammas.
Warning: very different from PhyloNetworks.setgamma!, which focuses on a single hybrid event, updates the field ismajor according to the new γ, and is not used here.
Assumption: each hybrid node has only 2 parents, a major and a minor parent (according to the edges' field ismajor).
PhyloTraits.setNe! — Method
For a GaussianCoalescent model, assign new Ne value, keeping v0 fixed and adjusting σ2 = old σ2 * old Ne / new Ne to maintain the same equilibrium within-population variance and same λ.
PhyloTraits.setalpha! — Method
setalpha!(obj, alpha)Set the shape parameter alpha in a RateVariationAcrossSites model obj, and update the rate multipliers accordingly. Return the modified object.
PhyloTraits.seteigeninfo! — Method
for a [BinaryTraitSubstitutionModel]: store eigenvalue (q01+q10) and stationary distribution
PhyloTraits.seteigeninfo! — Method
for a [EqualRatesSubstitutionModel]: store lambda = k/(k-1), where k is the number of states
PhyloTraits.seteigeninfo! — Method
for HKY85: store piR, piY, the 2 non-zero eigenvalues and a scaling factor
PhyloTraits.seteigeninfo! — Method
for JC69: store lambda = 4/3 (if relative) or rate * 4/3 (absolute).
PhyloTraits.seteigeninfo! — Method
seteigeninfo!(obj)Calculate eigenvalue & eigenfector information for a substitution model (SM) object (as needed to calculate transition rate matrices) and store this info within the object.
PhyloTraits.setparameters! — Method
setparameters!(obj::RateVariationAcrossSites, par::AbstractVector)Set the values of the alpha and/or pinv parameters of model obj. See also setalpha!, setpinv! and setpinvalpha!
PhyloTraits.setpinv! — Method
setpinv!(obj, pinv)Set the proportion of invariable sites pinv in a RateVariationAcrossSites model obj, and update the rate multipliers & weights accordingly. For RVASInvGamma objects, the original rate multipliers are assumed correct, according to the original pinv value. Return the modified object.
PhyloTraits.setpinvalpha! — Method
setpinvalpha!(obj, pinv, alpha)Set the proportion of invariable sites pinv and the alpha parameter for the discretized gamma distribution in a model obj of type RVASGammaInv{S}. Update the rate multipliers & weights accordingly. The mean of the distribution is constrained to 1.
Return the modified object.
PhyloTraits.setrates! — Method
setrates!(model, rates)update rates then call seteigeninfo! to update a model's eigeninfo
PhyloTraits.showQ — Method
showQ(IO, model)Print the Q matrix to the screen, with trait states as labels on rows and columns. adapted from prettyprint function by mcreel, found 2017/10 at https://discourse.julialang.org/t/display-of-arrays-with-row-and-column-names/1961/6
PhyloTraits.showdata — Function
showdata(io::IO, obj::SSM, fullsiteinfo::Bool=false)Return information about the data in an SSM object: number of species, number or traits or sites, number of distinct patterns, and more information if fullsiteinfo is true: number sites with missing data only, number of invariant sites, number of sites with 2 distinct states, number of parsimony-informative sites (with 2+ states being observed in 2+ tips), number of sites with some missing data, and overall proportion of entries with missing data.
Note: Missing is not considered an additional state. For example, if a site contains some missing data, but all non-missing values take the same state, then this site is counted in the category "invariant".
PhyloTraits.startingrate — Method
startingrate(net)Estimate an evolutionary rate appropriate for the branch lengths in the network, which should be a good starting value before optimization in fitdiscrete, assuming approximately 1 change across the entire tree. If all edge lengths are missing, set starting rate to 1/(number of taxa).
PhyloTraits.traitlabels2indices — Method
traitlabels2indices(data, model::SubstitutionModel)Check that the character states in data are compatible with (i.e. subset of) the trait labels in model. All columns are used. data can be a DataFrame or a Matrix (multiple traits), or a Vector (one trait). Return a vector of vectors (one per species) with integer entries, where each state (label) is replaced by its index in model. For DNA data, any ambiguous site is treated as missing.
PhyloTraits.update_logtrans — Method
update_logtrans(obj::SSM, edge::Edge)Update the log-transition probabilities associates to one particular edge in the network.
PhyloTraits.update_logtrans — Method
update_logtrans(obj::SSM)Initialize and update obj.logtrans, the log transition probabilities along each edge in the full network. They are re-used for each displayed tree, which is why edges are not fused around degree-2 nodes when extracting displayed trees.
StatsAPI.coeftable — Method
coeftable(m::PhyloNetworkLinearModel; level::Real=0.95)Return coefficient estimates, standard errors, t-values, p-values, and t-intervals as a StatsBase.CoefTable.
StatsAPI.confint — Method
confint(m::PhyloNetworkLinearModel; level::Real=0.95)Return confidence intervals for coefficients, with confidence level level, based on the t-distribution whose degree of freedom is determined by the number of species (as returned by dof_residual)
StatsAPI.deviance — Function
StatsBase.deviance(m::PhyloNetworkLinearModel)-2 loglikelihood of the fitted model. See also loglikelihood.
Note: this is not the residual-sum-of-squares deviance as output by GLM, such as one would get with deviance(m.model).
StatsAPI.deviance — Method
StatsBase.deviance(m::PhyloNetworkLinearModel, Val(true))Residual sum of squares with metric V, the estimated phylogenetic covariance, if the model is appropriate.
StatsAPI.fit — Method
fit(StatisticalSubstitutionModel, net, model, traits; kwargs...)
fit!(StatisticalSubstitutionModel; kwargs...)Internal function called by fitdiscrete: with same key word arguments kwargs. But dangerous: traits should be a vector of vectors as for fitdiscrete but here traits need to contain the indices of trait values corresponding to the indices in getlabels(model), and species should appear in traits in the order corresponding to the node numbers in net. See traitlabels2indices to convert trait labels to trait indices.
Warning: does not perform checks. fitdiscrete calls this function after doing checks, preordering nodes in the network, making sure nodes have consecutive numbers, species are matched between data and network etc.
StatsAPI.loglikelihood — Method
loglikelihood(m::PhyloNetworkLinearModel)Log likelihood or log restricted likelihood (REML) depending on m.reml, of the fitted model.
For models with no within-species variation, the likelihood (or REML) is calculated based on the joint density for species-level mean responses.
For within-species variation models, the likelihood is calculated based on the joint density for individual-level responses. This can be calculated from individual-level data, but also by providing species-level means and standard deviations which is accepted by phylolm.
Warning: many summaries are based on the species-level model, like "dof_residual", "residuals", "predict" or "deviance". So deviance is innapropriate to compare models with within-species variation. Use loglikelihood to compare models based on data at the individual level.
Reminder: do not compare ML or REML values across models fit on different data. Do not compare REML values across models that do not have the same predictors (fixed effects): use ML instead, for that purpose.
StatsAPI.nobs — Method
StatsBase.nobs(m::PhyloNetworkLinearModel)Number of observations: number of species with data, if the model assumes known species means, and number of individuals with data, if the model accounts for within-species variation.
StatsAPI.nulldeviance — Method
StatsBase.nulldeviance(m::PhyloNetworkLinearModel)For appropriate phylogenetic linear models, the deviance of the null model is the total sum of square with respect to the metric V, the estimated phylogenetic covariance matrix.
StatsAPI.predict — Method
predict(
obj::ReconstructedStates;
interval::Union{Symbol,Nothing}=nothing,
level::Real=0.95,
text::Bool=false,
digits::Int=2,
missingmark::AbstractString="*",
combine::Bool=false
)Estimated reconstructed states and, if interval=:prediction, prediction intervals with level level at all internal nodes and tips with missing traits.
If text=true, the prediction and intervals are formated as string for easy plotting with the nodelabel argument to plot from package PhyloPlots. In that case, digits controls the number of digits shown, missingmark adds a distinctive mark to the prediction of tips with missing data, (set to missingmark="" for no mark), and if combine=true, the prediction and bound of the intervals are combined into a single text string.
StatsAPI.stderror — Method
stderror(m::PhyloNetworkLinearModel)Return the standard errors of the coefficient estimates. See vcov for related information on how these are computed.
StatsAPI.vcov — Method
vcov(m::PhyloNetworkLinearModel)Return the variance-covariance matrix of the coefficient estimates.
For the continuous trait evolutionary models currently implemented, species-level mean response (conditional on the predictors), Y|X is modeled as:
- Y|X ∼ 𝒩(Xβ, σ²ₛV) for models assuming known species mean values (no within-species variation)
- Y|X ∼ 𝒩(Xβ, σ²ₛV + σ²ₑD⁻¹) for models with information from multiple individuals and assuming within-species variation
The matrix V is inferred from the phylogeny, but may also depend on additional parameters to be estimated (e.g. lambda for Pagel's Lambda model). See ContinuousTraitEM, PhyloNetworkLinearModel for more details.
If (1), then return σ²ₛ(X'V⁻¹X)⁻¹, where σ²ₛ is estimated with REML, even if the model was fitted with reml=false. This follows the conventions of nlme::gls and stats::glm in R.
If (2), then return σ²ₛ(X'W⁻¹X)⁻¹, where W = V+(σ²ₑ/σ²ₛ)D⁻¹ is estimated, and σ²ₛ & σₑ are the estimates obtained with ML or REML, depending on the reml option used to fit the model m. This follows the convention of MixedModels.fit in Julia.
StatsModels.isnested — Method
isnested(m1::PhyloNetworkLinearModel, m2::PhyloNetworkLinearModel)
isnested(m1::ContinuousTraitEM, m2::ContinuousTraitEM)True if m1 is nested in m2, false otherwise. Models fitted with different criteria (ML and REML) are not nested. Models with different predictors (fixed effects) must be fitted with ML to be considered nested.
index
PhyloTraits.BMPhyloTraits.ContinuousTraitEMPhyloTraits.GaussianCoalescentPhyloTraits.NucleicAcidSubstitutionModelPhyloTraits.OptSummaryPhyloTraits.PagelLambdaPhyloTraits.ScalingHybridPhyloTraits.StatisticalSubstitutionModelPhyloTraits.StatisticalSubstitutionModelPhyloTraits.SubstitutionModelPhyloTraits.WithinSpeciesCTMBase.getindexPhyloTraits.PPhyloTraits.P!PhyloTraits.anovaPhyloTraits.check_matchtaxonnames!PhyloTraits.defaultsubstitutionmodelPhyloTraits.discrete_backwardlikelihood_trait!PhyloTraits.discrete_corelikelihood!PhyloTraits.discrete_corelikelihood_trait!PhyloTraits.formatintervalPhyloTraits.gaussiancoalescent_covariancematrixPhyloTraits.getGammasPhyloTraits.getmissingtipmarksPhyloTraits.getparametersPhyloTraits.getparamindexPhyloTraits.getpredintPhyloTraits.lambdaPhyloTraits.lambda!PhyloTraits.lambda!PhyloTraits.learnlabelsPhyloTraits.posterior_loghybridweightPhyloTraits.posterior_logtreeweightPhyloTraits.setGammas!PhyloTraits.setNe!PhyloTraits.setalpha!PhyloTraits.seteigeninfo!PhyloTraits.seteigeninfo!PhyloTraits.seteigeninfo!PhyloTraits.seteigeninfo!PhyloTraits.seteigeninfo!PhyloTraits.setparameters!PhyloTraits.setpinv!PhyloTraits.setpinvalpha!PhyloTraits.setrates!PhyloTraits.showQPhyloTraits.showdataPhyloTraits.startingratePhyloTraits.traitlabels2indicesPhyloTraits.update_logtransPhyloTraits.update_logtransStatsAPI.coeftableStatsAPI.confintStatsAPI.devianceStatsAPI.devianceStatsAPI.fitStatsAPI.loglikelihoodStatsAPI.nobsStatsAPI.nulldevianceStatsAPI.predictStatsAPI.stderrorStatsAPI.vcovStatsModels.isnested