Chapter 4 Extract Variance Partitioning Components and Visualize Results
Variance partitioning in a multivariate framework using HMSC is a method to decompose the variation in multiple response variables, such as the abundances of different pollinator species, into contributions from spatial, temporal, and environmental components. Because these components are analyzed simultaneously across multiple species, the approach identifies both the shared and the species-specific drivers of pollinator abundance and shows how ecological processes shape pollinator communities.
4.1 Load Packages
library(readr)
library(dplyr)
library(ggplot2)
library(reshape2)
library(tidyr)
library(Hmsc)
library(tibble)
4.2 Load HMSC Model
VP_model_time_weather <- readRDS(here::here("Pollinator_Observations_HMSC_Workshop.rds"))
4.3 Local Pollinator Assemblage Variance Partitioning Analysis
The computeVariancePartitioning function in HMSC partitions the variance in multivariate response variables (e.g. species abundances or occurrences) across different components, typically spatial, temporal, and environmental factors. It quantifies how much of the variation in the response variables can be attributed to each of these factors.
In a multivariate framework, you may have several explanatory variables (i.e. fixed effects) that fall under different global categories. For example:
Variables like temperature, precipitation, and humidity fall under the weather category.
Variables like habitat type, floral resources, and vegetation cover fall under the habitat category.
This structure in the data can be accounted for in HMSC. The group and groupnames arguments of the computeVariancePartitioning function specify which explanatory variables belong together within a global category. Variance is then partitioned across these categories, showing how much of the variation in the response variables (e.g. pollinator abundance) is explained by each one (e.g. weather vs. habitat).
group: This argument defines which explanatory variables belong to the same category or group. It is a vector of group indices with one entry per column of the design matrix X.
groupnames: This argument assigns descriptive names to the groups defined in the group argument.
4.3.1 How It Works
Here, we use group and groupnames to combine the explanatory variables of the HMSC model (i.e., Time_Start_Decimal + Sky + Temperature + Wind) into two global categories. To do so, we first need to look at the design matrix X of the HMSC model, which is constructed from the XData and XFormula arguments.
head(VP_model_time_weather$X)
## (Intercept) Time_Start_Decimal SkyCloudy SkyMostly_Clear TemperatureHot
## 1 1 12.50000 0 0 1
## 2 1 11.75000 0 0 1
## 3 1 14.25000 0 0 0
## 4 1 12.08333 0 0 1
## 5 1 10.70000 0 0 0
## 6 1 10.93333 0 0 0
## TemperatureWarm WindNo WindWindy
## 1 0 0 0
## 2 0 0 0
## 3 1 0 0
## 4 0 0 1
## 5 1 0 0
## 6 1 0 0
We next group the explanatory variables accordingly. The first column of the design matrix X corresponds to the intercept, which does not explain any variance, so we can assign it to either group. The second column corresponds to the time at which the 10-minute censuses were conducted, in decimal format (i.e. Time_Start_Decimal). We will group the first two columns of the design matrix X together. Columns three to eight (the dummy variables for Sky, Temperature, and Wind) all relate to a weather category and thus will be grouped together. The variance partitioning results will show how much of the total variance in the response variables (e.g. pollinator abundance) is explained by each group.
VP_model_time_weather_variance_partition <- computeVariancePartitioning(VP_model_time_weather,
                                                                        group = c(rep(1, 2), rep(2, 6)),
                                                                        groupnames = c("Time_Start_Decimal", "Weather"))
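The group vector above is written by hand from column positions, which is easy to get wrong if the model formula changes. As a minimal sketch, the same vector could be derived from the column names of the design matrix. The column names below are copied from the head() output shown earlier; the object names x_columns, is_weather, and group_index are illustrative and not part of the workshop code.

```r
# Column names copied from the head(VP_model_time_weather$X) output above.
# With the fitted model loaded, colnames(VP_model_time_weather$X) should
# give the same vector (assumption: the model object is available).
x_columns <- c("(Intercept)", "Time_Start_Decimal", "SkyCloudy",
               "SkyMostly_Clear", "TemperatureHot", "TemperatureWarm",
               "WindNo", "WindWindy")

# Flag every dummy variable that belongs to the weather category
is_weather <- grepl("^(Sky|Temperature|Wind)", x_columns)

# Group 1 = intercept and time of day, group 2 = weather
group_index <- ifelse(is_weather, 2, 1)

# Inspect which column lands in which group before passing
# group_index to the group argument
data.frame(column = x_columns, group = group_index)
```

The resulting group_index has one entry per design-matrix column and matches c(rep(1, 2), rep(2, 6)) for this model.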
The following code reshapes the variance partitioning results from the HMSC model (the output of the computeVariancePartitioning function) into a format suitable for visualization. The key elements are:
VP_model_time_weather_variance_partition$vals: This is the result of the variance partitioning analysis. It contains the proportion of variance explained by each source (the fixed-effect groups and the random effects) for each pollinator functional group (e.g. bumblebees, butterflies, etc.).
rownames_to_column(var = "Source_Variance"): Row names correspond to the different sources of variance that explain variation in pollinator abundance. This step moves them into a regular column called Source_Variance.
pivot_longer(cols = -Source_Variance, names_to = "Pollinator_Functional_Group", values_to = "Variance_Proportion"): This is where the data is reshaped from a wide format (one column per functional group) to a long format (one row per combination of source and functional group).
data_VP_model_time_weather_variance_partition <- as.data.frame(VP_model_time_weather_variance_partition$vals) %>%
  rownames_to_column(var = "Source_Variance") %>%
  pivot_longer(cols = -Source_Variance,
               names_to = "Pollinator_Functional_Group",
               values_to = "Variance_Proportion") %>%
  arrange(Pollinator_Functional_Group)
data_VP_model_time_weather_variance_partition
## # A tibble: 70 × 3
## Source_Variance Pollinator_Functional_Group Variance_Proportion
## <chr> <chr> <dbl>
## 1 Time_Start_Decimal Bombus_Queen 0.0646
## 2 Weather Bombus_Queen 0.587
## 3 Random: Population Bombus_Queen 0.116
## 4 Random: Year Bombus_Queen 0.168
## 5 Random: Sample Bombus_Queen 0.0641
## 6 Time_Start_Decimal Bombus_Worker 0.167
## 7 Weather Bombus_Worker 0.459
## 8 Random: Population Bombus_Worker 0.0927
## 9 Random: Year Bombus_Worker 0.209
## 10 Random: Sample Bombus_Worker 0.0719
## # ℹ 60 more rows
The data frame shows how the variance in pollinator abundance is partitioned across the two fixed-effect groups (Time_Start_Decimal and Weather) and the three random effects (Population, Year, and Sample) for each pollinator functional group. This long format makes it easy to compare how much each source contributes across functional groups. The code below averages each source of variance across all functional groups.
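Judging by the printed values, the proportions for each functional group sum to roughly 1, which makes a quick sanity check possible before plotting. The sketch below uses a toy matrix filled with the rounded values printed above for the two Bombus groups; vals_toy, column_totals, and fixed_share are illustrative names, and with the fitted model the same check would be run on VP_model_time_weather_variance_partition$vals.

```r
# Toy matrix with the rounded proportions printed above
# (rows = sources of variance, columns = functional groups)
vals_toy <- matrix(
  c(0.0646, 0.587, 0.116, 0.168, 0.0641,   # Bombus_Queen
    0.167,  0.459, 0.0927, 0.209, 0.0719), # Bombus_Worker
  nrow = 5,
  dimnames = list(
    c("Time_Start_Decimal", "Weather", "Random: Population",
      "Random: Year", "Random: Sample"),
    c("Bombus_Queen", "Bombus_Worker")
  )
)

# Each column should sum to approximately 1 (allowing for rounding)
column_totals <- colSums(vals_toy)
all(abs(column_totals - 1) < 0.01)

# Share attributed to the fixed-effect groups, per functional group
fixed_share <- colSums(vals_toy[c("Time_Start_Decimal", "Weather"), ])
fixed_share
```

A column total far from 1 would suggest that rows were dropped or duplicated during reshaping.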
data_VP_model_time_weather_variance_partition_overall <- data_VP_model_time_weather_variance_partition %>%
  group_by(Source_Variance) %>%
  summarise(Mean_Variance = mean(Variance_Proportion))
data_VP_model_time_weather_variance_partition_overall
## # A tibble: 5 × 2
## Source_Variance Mean_Variance
## <chr> <dbl>
## 1 Random: Population 0.192
## 2 Random: Sample 0.241
## 3 Random: Year 0.139
## 4 Time_Start_Decimal 0.0810
## 5 Weather 0.347
4.4 Graphs to Visualize Variance Partitioning Analysis
4.4.1 Overall Sources of Variance Across All Pollinator Functional Groups
data_VP_model_time_weather_variance_partition %>%
  mutate(Source_Variance = factor(Source_Variance, levels = c("Random: Sample", "Random: Year", "Random: Population", "Weather", "Time_Start_Decimal"))) %>%
ggplot(aes(x = Source_Variance, y = Variance_Proportion, fill = Source_Variance)) +
geom_point(color = "black", position = position_jitter(width = 0.0), size = 2) +
geom_boxplot(alpha = 0.75, outlier.shape = NA) +
theme_bw(base_size = 15) +
theme(legend.position = "none",
legend.title = element_blank(),
legend.text = element_text(size = 15),
axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1, size = 15),
axis.text.y = element_text(size = 15),
panel.grid.minor.y = element_blank()) +
xlab("") +
ylab("Proportion of Explained Variance") +
scale_fill_manual(values = c("#FDE725", "#5DC963", "#21908D", "#3B528B", "#440154")) +
scale_x_discrete(labels=c("Random: Sample ID",
"Random: Year",
"Random: Population",
"Weather",
"Time")) +
stat_summary(fun = mean, geom = "point", size = 5, color = "black" ) +
ggtitle("Number of Visits to Flowers")
4.4.2 Specific Sources of Variance for Individual Pollinator Functional Groups
data_VP_model_time_weather_variance_partition %>%
  mutate(Source_Variance = factor(Source_Variance, levels = c("Random: Sample", "Random: Year", "Random: Population", "Weather", "Time_Start_Decimal"))) %>%
ggplot(aes(x = Pollinator_Functional_Group, y = Variance_Proportion, fill = Source_Variance)) +
geom_bar(position = "stack", stat = "identity", color = "black") +
theme_bw(base_size = 15) +
theme(legend.position = "right",
legend.title = element_blank(),
legend.text = element_text(size = 15),
axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1, size = 15),
axis.text.y = element_text(size = 15),
panel.grid.minor.y = element_blank()) +
xlab("") +
ylab("Proportion of Explained Variance") +
scale_fill_manual(values = c("#FDE725", "#5DC963", "#21908D", "#3B528B", "#440154"),
labels = c("Random: Sample ID",
"Random: Year",
"Random: Population",
"Weather",
"Time")) +
scale_x_discrete(labels=c(expression(italic("Bombus") ~ " Queen"),
expression(italic("Bombus") ~ " Worker"),
expression(italic("Bombus terrestris") ~ " Worker"),
"Bombyliidae",
"Coleoptera",
"Diurnal Lepidoptera",
"Honeybee",
"Large Solitary Bee",
"Noctuidae",
"Non-Syrphid Diptera",
"Other",
"Small Solitary Bee",
"Sphingidae",
"Syrphidae")) +
ggtitle("Number of Visits to Flowers")
4.5 Questions Addressed By Variance Partitioning With HMSC
How much of the variation in species abundance is explained by spatial factors, temporal factors, and environmental factors?
What is the relative importance of spatial vs. temporal variation in explaining community structure?
Do different species exhibit similar spatial and temporal patterns of variation?
Are specific species more sensitive to particular environmental variables?
Which predictors explain the largest proportion of variance in community structure?