Relationship between local population density and environmental suitability estimated from occurrence data

ISSN 1948-6596 opinion and perspectives

Figure 1. Idealized triangular relationship between suitability and abundance. Two vectors – one representing abundance (y) and another suitability (x) – were generated. In the case of x, 1000 values ranging between 0 and 1 were randomly extracted from a uniform distribution. In the case of y, 1000 values were also randomly extracted from a uniform distribution following the condition 0 ≤ yi ≤ xi (i = {1, ..., 1000}). Linear quantile regressions were fitted to the 97.5th percentile (red dashed line). From 1000 simulations, the mean goodness-of-fit statistic for quantile regressions (R¹; Koenker and Machado 1999) was computed, with a value of 0.60. Analyses were done in R using the quantreg package (Koenker 2009).
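The simulation summarized in the Figure 1 caption can be approximated with a short script. The sketch below is written in Python rather than R and does not use quantreg; instead, it fits the linear quantile regression by a ternary search over the slope, setting the intercept at each step to an empirical quantile of the partial residuals. It assumes that each yi is drawn uniformly between 0 and xi, which is one reading of the condition 0 ≤ yi ≤ xi, and it computes R¹ as 1 minus the ratio of the minimized weighted sums of absolute residuals for the fitted and intercept-only models (Koenker and Machado 1999). The function names, the seed, the slope search bounds and the reduced number of simulations (20 rather than 1000) are illustrative choices, not taken from the original analysis.

```python
import math
import random


def check_loss(residuals, tau):
    """Weighted sum of absolute residuals (quantile 'check' function)."""
    return sum(r * tau if r >= 0 else r * (tau - 1.0) for r in residuals)


def best_intercept(values, tau):
    """An empirical tau-quantile; it minimizes the check loss over constants."""
    ordered = sorted(values)
    k = min(len(ordered) - 1, max(0, math.ceil(len(ordered) * tau) - 1))
    return ordered[k]


def profiled_loss(x, y, slope, tau):
    """Check loss at a given slope, with the intercept set to its optimum."""
    partial = [yi - slope * xi for xi, yi in zip(x, y)]
    a = best_intercept(partial, tau)
    return check_loss([p - a for p in partial], tau), a


def fit_linear_qr(x, y, tau, lo=-10.0, hi=10.0, iters=60):
    """Approximate linear QR: ternary search over the slope.

    The profiled loss is convex in the slope, so the search converges.
    The bounds lo and hi are assumptions suited to this example.
    """
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if profiled_loss(x, y, m1, tau)[0] <= profiled_loss(x, y, m2, tau)[0]:
            hi = m2
        else:
            lo = m1
    slope = (lo + hi) / 2.0
    loss, intercept = profiled_loss(x, y, slope, tau)
    return intercept, slope, loss


def r1(x, y, tau):
    """Goodness of fit R1 = 1 - V_hat / V_tilde for a single quantile."""
    _, _, v_hat = fit_linear_qr(x, y, tau)
    q = best_intercept(y, tau)
    v_tilde = check_loss([yi - q for yi in y], tau)
    return 1.0 - v_hat / v_tilde


def simulate_triangle(n, rng):
    """x ~ U(0, 1); y ~ U(0, x), so that 0 <= y_i <= x_i (assumed reading)."""
    x = [rng.random() for _ in range(n)]
    y = [rng.uniform(0.0, xi) for xi in x]
    return x, y


rng = random.Random(1)  # arbitrary seed, for repeatability only
n_sims = 20             # the caption reports 1000 simulations
values = [r1(*simulate_triangle(1000, rng), 0.975) for _ in range(n_sims)]
mean_r1 = sum(values) / n_sims
print("mean R1 at the 97.5th percentile:", round(mean_r1, 2))
```

Under these assumptions the mean R¹ should fall close to the 0.60 reported in the caption, well below 1 even though the triangular relationship is perfect by construction.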

The spatial configuration of species' abundance has been a topic of discussion in ecology for a century (see Gaston 2003). Abundance has repeatedly been reported to peak at the centre of geographic ranges, and this pattern has even been accepted as a rule of thumb (Sagarin and Gaines 2002). Assuming that local abundance is a by-product of Hutchinsonian niche processes, and that some of the most determinant variables demonstrate spatial autocorrelation and some interindependence, Brown (1984) argued that the high abundances found at the centre of the ranges of distribution are to be expected. However, the pattern is far from general (Sagarin and Gaines 2002; Gaston 2009); there are several contingent and demographic causes that may disrupt the "abundant centre" distribution hypothesis (Rapoport 1975; Brown 1984). Even when abundance truly reflects environmental suitability, given the potential complexity of the structure and geographic patterns of niches (Soberón and Nakamura 2009), violations of the single and centred environmental optimum may be the rule rather than the exception.
Species distribution models (SDMs) have become popular in recent years, in part because of the increased availability of occurrence data in huge biological databases. Within the usual terms of "ecological niche models" or "habitat suitability models" lies the assumption that, since occurrence data reflect some parts of the niches, these methods estimate something close to the environmental suitability (herein suitability for short) of the species (Soberón and Nakamura 2009). Given that occurrence data are much easier to compile than abundance, an obvious question that arises is whether local suitability can inform us about local abundance. Curiously, few studies have directly explored this topic, and the results of the investigations of Pearce and Ferrier (2001), Nielsen et al. (2005) and Jiménez-Valverde et al. (2009) are not encouraging at all.
The study of VanDerWal et al. (2009) went a step further towards understanding the relationship between suitability and abundance. These authors suggested that the relationship between suitability and abundance should be triangular (see Fig. 1). They reasoned that, since many factors may reduce the theoretical maximum density that a species can reach at a certain location, suitability should determine the maximum limit instead of the mean abundance. The authors modelled the distribution of 69 species of vertebrates in the Australian wet tropics region. Presence data for the species were compiled from a variety of sources, and several climate and vegetation-related variables were used to fit the SDMs. The authors applied Maxent, a technique that has gained surprising popularity recently and that uses presence and background (i.e., locations with no information about the occurrence of the focal species) data. Then they studied the relationship between suitability and abundance data derived from standardized sampling.
To test for the expected polygonal relationship, the authors argued that ordinary least squares regressions were not suitable. In fact, suitability accounted for only 12% (mean) of the variation in vertebrates' mean abundance. Certainly, suitability was not a good surrogate of abundance. Instead, they applied quantile regressions (QR), which are appropriate for dealing with the unequal variation in ecological data associated with limiting variables (Cade and Noon 2003). VanDerWal and collaborators found evidence for a positive relationship between the upper limit of local abundance and suitability. Here, I want to stress four concerns.
First, the calculation of an R-squared measure (explained variance) makes little sense in a QR context (see FAQ() in Koenker 2009). While in a mean regression (ordinary least squares regression) the sum of squared residuals is minimized, in a median regression (QR fitted to the 50th percentile) what is minimized is the sum of absolute residuals. Thus, in QR, the interest lies in a measure based on the weighted sum of absolute residuals (R¹) as a local measure of the goodness-of-fit for a unique quantile (Koenker and Machado 1999). Second, the reported values of R¹ are not generally very large (mean value lower than 0.2), although it has to be taken into account that for an idealized perfect triangular relationship the maximum R¹ that a linear QR fitted to the 97.5th percentile can yield is 0.60 (not 1; see Fig. 1). Third, fitting a function with a fixed intercept at point (0, 0) (as done by the authors with nonlinear QR, one of several functions fitted) may yield statistically significant but at the same time spurious results (see Fig. 2). Finally, VanDerWal and collaborators applied a threshold to convert continuous maps into presence-absence maps, and excluded locations falling below that threshold (i.e. locations with low suitability values) from the abundance-suitability regression analysis. This allowed them to avoid the inclusion of a potentially large number of zero counts in locations with very low suitability values (in which case a positive abundance-suitability relationship may be indicating good discrimination capacity of the SDMs rather than capacity to account for abundance) while still taking into account absence data within the potential range of the species. However, it would be very informative to know the extent to which the thresholding process excluded locations with very low suitability values but with non-zero species' abundance. In any case, the fact that linear QR showed higher significant slopes in the upper percentiles, and that R¹ progressively increased, hints at the idea of a polygonal relationship. Clearly, the study of VanDerWal et al. has brought to light a new perspective in the way to approach the analysis of the "abundance-suitability" pattern. Given the difficulty of obtaining abundance data and its potential relevance in conservation practices (Brown et al. 1995), many more studies in this line of research are required to discern under which circumstances, if any, the pattern holds.

Figure 2. […] intercept. Two uncorrelated vectors – one representing the abundance (y) and another the suitability (x) – were generated. In the case of y, 100 values ranging between 0 and 10 were randomly extracted from a uniform distribution. In the case of x, 100 values ranging between 0.3 and 1 were also randomly extracted from a uniform distribution. Nonlinear quantile regressions were fitted to the 95th (red dashed line) and 50th (blue dashed line) percentiles using the function y = max(y)[1 − exp(−bx)] (in this example, max(y) = 10). The parameter b was estimated as 6.84 (95th) and 1.20 (50th) and was highly significant in both cases (p<0.0001). Analyses were done in R using the quantreg package (Koenker 2009).
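The set-up described for Figure 2 can also be sketched in a few lines. The Python example below generates independent x and y values as in the caption and fits a saturating curve that is forced through the origin, y = y_max[1 − exp(−bx)], to the 50th and 95th percentiles. This functional form is my reading of the function given in the caption, chosen because it has a fixed intercept at (0, 0). A plain grid search over b stands in for the nonlinear QR routine of quantreg, and the grid, the seed and all function names are hypothetical choices. No significance test is attempted.

```python
import math
import random


def check_loss(residuals, tau):
    """Weighted sum of absolute residuals (quantile 'check' function)."""
    return sum(r * tau if r >= 0 else r * (tau - 1.0) for r in residuals)


def saturating(x, b, y_max):
    """Curve forced through (0, 0): y = y_max * (1 - exp(-b * x))."""
    return y_max * (1.0 - math.exp(-b * x))


def fit_b(x, y, tau, y_max, b_grid):
    """Grid search for the b that minimizes the check loss at quantile tau."""
    def loss(b):
        return check_loss(
            [yi - saturating(xi, b, y_max) for xi, yi in zip(x, y)], tau
        )
    return min(b_grid, key=loss)


def pearson(x, y):
    """Pearson correlation, used only to confirm that x and y are unrelated."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(2)  # arbitrary seed, for repeatability only
x = [rng.uniform(0.3, 1.0) for _ in range(100)]   # "suitability"
y = [rng.uniform(0.0, 10.0) for _ in range(100)]  # "abundance", independent of x

y_max = 10.0                                      # max(y) = 10 in the caption
b_grid = [i / 100.0 for i in range(1, 3001)]      # assumed search range (0, 30]

b50 = fit_b(x, y, 0.50, y_max, b_grid)
b95 = fit_b(x, y, 0.95, y_max, b_grid)
r_xy = pearson(x, y)

print("correlation between x and y:", round(r_xy, 3))
print("b at the 50th percentile:", b50)
print("b at the 95th percentile:", b95)
```

A clearly positive b is obtained at both percentiles even though x and y are independent by construction: because the curve must start at the origin, it has to rise with x simply to reach the bulk of the data.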
Finally, an issue that worries me about the general idea of the "abundance-suitability" pattern is that (assuming well-designed sampling, yielding unbiased occurrence data), if the density of points across the grid cells is a reflection of abundance, then a positive relationship between suitability and abundance may be a trivial fact. (Note that, as is usually done in SDM studies, VanDerWal and collaborators aggregated multiple observations at the same location, which certainly helped avoid tautological relationships, but this does not invalidate my argument: I am referring to the density of points across the grid cells.) My reasoning is that it seems justified to assume that abundance affects the probability of detecting the species. Therefore, if the probability of detection determines the geographic density of presence points across the grid cells, is the positive relationship between local abundance and suitability that is estimated from well-calibrated SDMs a revelatory pattern or just the consequence of circular reasoning?