Modelling and predicting biogeographical patterns in river networks

Abstract. Statistical analysis and interpretation of biogeographical phenomena in rivers is now possible using a spatially explicit modelling framework, which has seen significant developments in the past decade. I used this approach to identify a spatial extent (geostatistical range) in which the abundance of the parasitic freshwater pearl mussel (Margaritifera margaritifera L.) is spatially autocorrelated in river networks. I show that biomass and abundance of host fish are a likely explanation for the autocorrelation in mussel abundance within a 15-km spatial extent. The application of universal kriging with the empirical model enabled precise prediction of mussel abundance within segments of river networks, something that has the potential to inform conservation biogeography. Although I used a variety of modelling approaches in my thesis, I focus here on the details of this relatively new spatial stream network model, thus advancing the study of biogeographical patterns in river networks.


Introduction
I examine the biogeography of the river-dwelling freshwater pearl mussel (Margaritifera margaritifera L.), a species that passes a parasitic larval stage attached to a host fish. Two fish species are suitable hosts for the larvae in my study region: brown trout (Salmo trutta L.; migratory and resident ecotypes) and Atlantic salmon (Salmo salar L.; migratory ecotype). My study region encompasses 20 river drainages of Galicia in northwest Spain, approximately 30,000 km². Previous publications from Lois (2015) include a description of sampling and data collection (Lois et al. 2014) and a spatial analysis of mussel abundance aimed at identifying the relative importance of biotic interactions on the distribution and abundance of the parasitic pearl mussel. Here, I focus on the latter paper, which applied a relatively new spatially explicit model for river ecosystems (Ver Hoef et al. 2006). My objective here is to show that this method advances river biogeography and that future developments could further expand this new frontier.
Rivers have specific spatial characteristics that influence their biogeographical patterns. They are embedded in drainage basins, and dispersal of aquatic organisms is generally constrained by the structure and directional connectivity of the dendritic network. The majority of prior studies on rivers have used techniques developed for terrestrial environments (Isaak et al. 2014). Some have concluded that biodiversity patterns across drainage basins are similar to those in other isolated terrestrial environments such as islands or mountains (Hugueny 1989). However, rivers are dendritic networks with a directional flow (Peterson et al. 2013) that defines the pathways connecting environments and organisms. Thus, an on-network analysis (Peterson et al. 2013) can be applied to yield information about the spatial dependence of biodiversity within river networks.
Two features create spatial dependence within river networks. First, the flow of water in rivers transports material downstream, including organisms originating from the river or from the surrounding landscape (Ward et al. 2002, Wiens 2002). This process has been labelled the "tail-up" source of autocovariance in river networks (Cressie et al. 2006, Ver Hoef et al. 2006). Second, upstream movement of organisms against the river flow, as with fish for example (Cressie and O'Donnell 2010), has been labelled "tail-down" autocovariance (Ver Hoef et al. 2006). Thus, any biogeographical study in rivers is likely to benefit from accounting for the contribution of these two ecosystem processes to species distribution and abundance.

Sampling and Data
Galicia, Spain contains many drainage basins with outlets to the ocean that provide opportunities to examine the distribution and abundance of freshwater organisms within and among basins. I used a two-phase doubly stratified sampling design to provide baseline information on the presence and abundance of M. margaritifera at 2436 geographic locations (Lois et al. 2014). A total of 435 records of mussel presence and abundance in 20 river networks were used in my thesis (Lois 2015). To summarize the environmental conditions within the region, I acquired 16 gridded predictor variables (spatial resolution of 40 x 40 m) belonging to three categories: (a) climate (average annual precipitation, average summer precipitation, average annual temperature, average summer temperature, maximum summer temperature, minimum summer temperature), (b) geology (granitic rocks, detrital deposits, metamorphic rocks), and (c) landform characteristics (slope, forest cover, elevation). I used a fourth category of predictor variables, abundance and biomass of the two host fish species (Atlantic salmon, migratory trout, resident trout, total salmonid biomass), compiled from the Fish Database of European Streams (Beier et al. 2007) to account for possible effects of host fish on abundance of the parasitic mussel. Further information about data sources and environmental predictors can be found in Lois (2015) and Lois et al. (2015).

Parasite-Host Model in River Networks
The freshwater pearl mussel with its host salmonids is an example of a parasite-host model that occurs globally. Freshwater mussels (order Unionoida) inhabit all continents except Antarctica (Bogan 2008) and they commonly have a life cycle where the larvae rely on host fish for their survival, growth, metamorphosis, and dispersal (Strayer 2008). Dispersal of parasitized host fish will often disperse juvenile mussels into new regions along the river. In an upstream direction, mussels can be dispersed into flow-unconnected tributaries so that a parasitic mussel species can come to occupy many branches in the dendritic network. After a juvenile mussel drops from the host fish, its future location is biased toward occupying a downstream location with displacement from the riverbed by high flows. In contrast, the long-lived adult mussel is a benthic filter feeder living partly buried in the riverbed. The adult mussel has very limited ability to move upstream in the river network and its muscular foot reduces likelihood of downstream displacement. With this parasite-host system, I investigate the effects of abiotic and biotic (host fish) factors on mussel abundance within the context of a spatially explicit model for river networks.

Modelling Parasite Abundance
To appreciate the utility of the geostatistical mixed model for river networks in river biogeography, a brief description of the model is necessary (Ver Hoef et al. 2006, Peterson et al. 2013). This spatial stream network model differs notably from many terrestrial applications of spatial mixed models that rely on Euclidean distances between geographical locations. In a river network, more appropriate metrics for quantifying spatial dependence include instream distance and whether or not pairs of sites share flow (flow-connected or flow-unconnected).
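The two metrics named above, instream distance and flow connection, can be illustrated with a minimal sketch. The toy network, the site names and the channel lengths are hypothetical and are not taken from the study; the functions are illustrative helpers, not part of the STARS or SSN software.

```python
# Hypothetical toy network: each node maps to (downstream neighbour, channel length in km).
# Sites A and B sit on two tributaries that meet at junction J; site C is on the main stem.
DOWNSTREAM = {
    "A": ("J", 4.0),
    "B": ("J", 6.0),
    "J": ("C", 3.0),
    "C": ("outlet", 5.0),
}


def path_to_outlet(node):
    """Map every node on the downstream path to its instream distance from `node`."""
    distances = {node: 0.0}
    total = 0.0
    while node in DOWNSTREAM:
        node, length = DOWNSTREAM[node]
        total += length
        distances[node] = total
    return distances


def stream_relation(site_1, site_2):
    """Return (instream distance in km, flow_connected) for a pair of sites."""
    path_1 = path_to_outlet(site_1)
    path_2 = path_to_outlet(site_2)
    # Dicts keep insertion order, so this is the nearest common downstream node.
    common = next(n for n in path_1 if n in path_2)
    # A pair shares flow only when one site lies on the other's downstream path.
    flow_connected = common in (site_1, site_2)
    return path_1[common] + path_2[common], flow_connected


print(stream_relation("A", "B"))  # sites on different tributaries
print(stream_relation("A", "C"))  # water from A flows past C
```

In this sketch, A and B are 10 km apart along the channel but flow-unconnected, whereas A and C are 7 km apart and flow-connected. A Euclidean distance would not distinguish the two cases.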
The spatial stream network model includes fixed effect predictors along with multiple random effects (autocovariates), whereas terrestrial applications of spatial mixed models typically only include one random effect, pairwise Euclidean distance. The stream network analysis is framed in a variance components perspective, so that one can see the relative amount of variance explained by fixed effects (abiotic and biotic predictors) and each random effect. As stated earlier, there are two spatially explicit autocovariates for river networks, tail-up and tail-down processes. Euclidean distance can also be included in a stream network model. However, prior studies of river networks have found that Euclidean distance explains very little of the total variance (e.g., Isaak et al. 2014). An estimate of the spatial extent (geostatistical range) at which autocorrelation occurs for tail-up and tail-down processes is obtained with the stream network model. The geostatistical mixed model for stream networks differs from a correlogram or semivariogram approach for accounting for spatial autocorrelation because the mixed model uses all the data for restricted maximum-likelihood estimation of model parameters [fixed effects (covariates) and random effects (partial sill and range for each autocovariate)]. In contrast, inferences from the correlogram (and also variogram) approach rely on the shape of the autocorrelation curve (Legendre 1993), which can be affected by how one chooses to bin distance values (Diniz-Filho et al. 2003).
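The variance components perspective described above can be written compactly. The notation below is an assumption introduced for illustration (it is not taken from the thesis), and it omits the optional Euclidean component:

```latex
% Assumed notation: y = vector of mussel abundances, X = fixed-effect predictors,
% z_TU and z_TD = tail-up and tail-down random effects, epsilon = independent error.
\begin{align}
  \mathbf{y} &= \mathbf{X}\boldsymbol{\beta}
               + \mathbf{z}_{TU} + \mathbf{z}_{TD} + \boldsymbol{\varepsilon}, \\
  \operatorname{Var}(\mathbf{y}) = \boldsymbol{\Sigma}
             &= \sigma^2_{TU}\,\mathbf{R}_{TU}(\alpha_{TU})
              + \sigma^2_{TD}\,\mathbf{R}_{TD}(\alpha_{TD})
              + \sigma^2_{0}\,\mathbf{I}.
\end{align}
```

Here each \sigma^2 is a partial sill (\sigma^2_0 being the nugget), each \alpha is a range parameter, and each \mathbf{R} is a correlation matrix built from instream distances. Restricted maximum likelihood estimates the partial sills and ranges from all observations at once, which is the contrast with binned correlogram or semivariogram estimates drawn in the text.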
The tail-up and tail-down components in the stream network model can be understood intuitively as follows. The tail-up component considers the river from its outlet to its headwaters, a downstream-to-upstream view that only considers flow-connected sites in the network. In contrast, the tail-down component considers the river from the headwaters to the outlet, a perspective that utilizes both flow-connected and flow-unconnected sites. In my analyses, I was particularly interested in the geostatistical range for the tail-down component, because mussel dispersal to flow-unconnected locations only occurs in association with parasitism.
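The difference between the two components can be sketched with exponential covariance functions. The exponential form, the factor of 3 (which makes the range parameter an effective range) and all parameter values are assumptions chosen for illustration; the text does not state which covariance functions were fitted. The `weight` argument stands in for the spatial weights that tail-up models apply at confluences and is likewise hypothetical.

```python
import math


def tail_up_exponential(h, flow_connected, partial_sill, range_km, weight=1.0):
    """Tail-up covariance: non-zero only for flow-connected pairs.

    h is the instream distance between the two sites (km).
    """
    if not flow_connected:
        return 0.0
    return weight * partial_sill * math.exp(-3.0 * h / range_km)


def tail_down_exponential(a, b, partial_sill, range_km):
    """Tail-down covariance: defined for connected and unconnected pairs.

    a and b are the instream distances (km) from each site down to their
    nearest common junction. For a flow-connected pair one of them is 0,
    so a + b is simply the separation distance.
    """
    return partial_sill * math.exp(-3.0 * (a + b) / range_km)


# Hypothetical pair on different tributaries, 4 km and 6 km above the junction.
sill, rng = 1.0, 15.0
print(tail_up_exponential(10.0, False, sill, rng))  # no shared flow, so zero
print(tail_down_exponential(4.0, 6.0, sill, rng))   # still positively correlated
```

Under these assumptions a flow-unconnected pair contributes nothing to the tail-up component but still contributes to the tail-down component, which is why the tail-down range is the quantity of interest for dispersal into flow-unconnected tributaries.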

Results and Discussion
For the geostatistical mixed model, a spatial stream network (SSN) dataset containing 20 river networks was created using the STARS geoprocessing toolset. I included tail-up and tail-down random effects in the mixed model (Peterson et al. 2013). The analysis was carried out with the SSN package in R (R Development Core Team 2016). The model explained 52% of the variance in mussel abundance; two biotic predictors (salmonid biomass and resident trout density) were the only significant fixed effects, explaining 2% of the variance. In contrast, the tail-up and the tail-down components of spatial covariance explained 38% and 12%, respectively. Thus, approximately three times more variation in mussel abundance was explained by the downstream-to-upstream perspective than by the upstream-to-downstream perspective, which suggests that passive downstream processes are more important in determining mussel abundance in river networks.
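The variance partition reported above can be checked with simple arithmetic. The percentages are those given in the text; the variable names are illustrative only.

```python
# Percentages of variance in mussel abundance, as reported for the full dataset.
components = {"fixed effects": 2, "tail-up": 38, "tail-down": 12}

total_explained = sum(components.values())   # fixed + tail-up + tail-down
unexplained = 100 - total_explained          # remainder not attributed to any component
ratio_up_to_down = components["tail-up"] / components["tail-down"]

print(f"explained: {total_explained}%, unexplained: {unexplained}%")
print(f"tail-up / tail-down: {ratio_up_to_down:.1f}")
```

The three components sum to the 52% reported for the full model, and the tail-up to tail-down ratio of roughly 3.2 is the basis for the "approximately three times" statement.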
Relative to the results for the full dataset, upstream processes (fish movements) were suggested to be more important in determining mussel abundance where migratory host fish were present. I obtained this result by analyzing two subsets of data representing locations with (n = 161) and without (n = 274) migratory host fish, which approximates the fragmentation of host fish populations by dams in the different basins of the study region. The amount of variance explained for these two subsets of data differed markedly. Mussel abundance was higher and ca. 25% more variance in mussel abundance was explained where migratory host fish were present than where they were absent. Where migratory host fish were present, 78% of the variance in mussel abundance was explained; fixed effects accounted for 23%, tail-up for 23% and tail-down for 31%, thereby suggesting a relatively greater importance for parasitized fish movements. Several additional lines of evidence support migratory host fish as having an essential positive effect on mussel abundance in this parasite-host system (Schwalb et al. 2011, Lois 2015).
The results indicate an important negative impact of dams excluding migratory host fish from mussel populations.
The spatial extent of autocorrelation in mussel abundance in river networks was estimated using the geostatistical range (Isaak et al. 2014). For the full dataset of mussel abundance, the range of spatial autocorrelation was ca. 17 km for the tail-up component and ca. 0.7 km for the tail-down component. In contrast, the subset with migratory host fish present had a larger range for tail-down (16 km) than for tail-up (0.1 km). Thus, my results suggest that the spatial extent over which biotic interactions affect mussel abundance is greater than 15 km. This finding identifies the scale at which conservation efforts directed to simultaneously managing parasite and host populations would be most effective. This endeavor would conserve a biotic interaction that is critical for the survival of the parasite population.
The geostatistical mixed model for spatial stream networks provided predictions of mussel abundance at a 1-km spatial resolution using universal kriging (Krige 1966). The kriging-based estimates of mussel abundance gave a spatially continuous perspective that accounted for the spatial extent of biotic interactions in the 20 river networks. As an example for conservation application, I combined kriging predictions with data on population age structures to identify four categories of river networks for conservation planning (Lois 2015). The ability to predict features of biodiversity at an intermediate spatial resolution (e.g., 1-10 km) can facilitate designing conservation regions, targeting areas for restoration, or identifying critical areas of human impact (Lois 2015).
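For readers unfamiliar with universal kriging, the predictor referred to above has a standard textbook form. The notation is an assumption introduced here and is not taken from the thesis: \boldsymbol{\Sigma} is the fitted covariance matrix among observed sites, \mathbf{c}_0 the vector of covariances between the observed sites and a prediction location s_0, and \mathbf{x}_0 the predictor values at s_0.

```latex
\begin{align}
  \hat{\boldsymbol{\beta}} &=
      (\mathbf{X}^{\top}\boldsymbol{\Sigma}^{-1}\mathbf{X})^{-1}
      \mathbf{X}^{\top}\boldsymbol{\Sigma}^{-1}\mathbf{y}, \\
  \hat{y}(s_0) &= \mathbf{x}_0^{\top}\hat{\boldsymbol{\beta}}
      + \mathbf{c}_0^{\top}\boldsymbol{\Sigma}^{-1}
        (\mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}}), \\
  \operatorname{Var}\!\big[\hat{y}(s_0) - y(s_0)\big] &=
      C(0) - \mathbf{c}_0^{\top}\boldsymbol{\Sigma}^{-1}\mathbf{c}_0
      + \mathbf{d}^{\top}
        (\mathbf{X}^{\top}\boldsymbol{\Sigma}^{-1}\mathbf{X})^{-1}\mathbf{d},
  \qquad \mathbf{d} = \mathbf{x}_0 - \mathbf{X}^{\top}\boldsymbol{\Sigma}^{-1}\mathbf{c}_0 .
\end{align}
```

The first term of the predictor is the fixed-effect trend at the new location; the second term adjusts it using the residuals at nearby sites, weighted by covariances computed from the estimated tail-up and tail-down partial sills and ranges rather than from Euclidean distance.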
The spatial stream network technique has now been used to analyze physico-chemical stream data, fish counts, macroinvertebrate data (e.g., Peterson and Ver Hoef 2010, Isaak et al. 2014, Frieden et al. 2014) and biotic interactions. Future developments in spatial stream network models are likely to include extensions to population genetic data and the development of multivariate models to study the biogeography of riverine biotic communities. Spatial stream network models are poised to exploit the current growth of freshwater biodiversity databases, enabling macro-scale analyses of freshwater biodiversity in river networks. Broader application of the model-based analyses described above has the potential to improve understanding of biotic interactions and other biogeographical processes in riverine environments, and so to expand the frontier of river biogeography.